Mapping the Extracellular and Membrane Proteome Associated with the Vasculature and the Stroma in the Embryo*

In order to map the extracellular or membrane proteome associated with the vasculature and the stroma in an embryonic organism in vivo, we developed a biotinylation technique for chicken embryo and combined it with mass spectrometry and bioinformatic analysis. We also applied this procedure to implanted tumors growing on the chorioallantoic membrane or after the induction of granulation tissue. Membrane and extracellular matrix proteins were the most abundant components identified. Relative quantitative analysis revealed differential protein expression patterns in several tissues. Through a bioinformatic approach, we determined endothelial cell protein expression signatures, which allowed us to identify several proteins not yet reported to be associated with endothelial cells or the vasculature. This is the first study reported so far that applies in vivo biotinylation, in combination with robust label-free quantitative proteomics approaches and bioinformatic analysis, to an embryonic organism. It also provides the first description of the vascular and matrix proteome of the embryo that might constitute the starting point for further developments.

The vasculature has an important role in embryonic development and tissue homeostasis (1). It is also involved in disease processes in which defective or excessive vascularization is observed, such as chronic inflammation and tumor growth (2). The extracellular matrix and stromal fibroblasts also participate in these processes and play an important role in tissue remodeling (3,4). For therapeutic intervention, the molecular repertoire of the vasculature and stroma must be known.
The chicken embryo is a model organism that has been particularly well studied in different types of experiments. For example, studies on the chicken embryo have had a significant impact on developmental biology. Techniques such as the creation of quail-chick chimeras were developed to study the migration and the fate of cell populations in intact embryos (5). This has led to the elucidation of the origins and fate of neural crest cells, the discovery of hemangioblasts, and a better understanding of neural tube patterning (6,7). Furthermore, techniques for gain- and loss-of-function studies, promoter analysis, and transgenesis have recently been developed, together with methods for the isolation of embryonic stem cells (8-10). In addition, the sequencing of the chicken genome has been completed.
The chicken embryo, in particular the chorioallantoic membrane (CAM), has also been used to gain insights into the different steps of vessel formation and is an experimental in vivo tumor model for cancer research (11,12). It has been utilized for the evaluation of angiogenic and anti-angiogenic compounds within its vasculature and to study the growth, angiogenesis, and metastasis of human tumors (13-15). By implanting tumor cells onto the CAM 6 days after fertilization of the egg, a model that simulates key features of human tumor growth and vascularized tumors can be generated, allowing rapid research into human tumor progression (13-15).
Functional genomic strategies have recently been applied to characterize the vascular proteome. Neri and colleagues developed a method for the detection of antigens accessible in the blood stream, based on the terminal perfusion of tumor-bearing rodents with a reactive ester of biotin (16,17). This in vivo labeling procedure is followed by the recovery of biotinylated proteins from normal organs and tumors, which are then proteolytically digested and submitted to comparative mass spectrometry analysis. This approach has been extended to the ex vivo perfusion of surgically resected human colon cancer, thereby directly revealing the overexpression of markers in the tumor matrix or vessels (18). Proteomic technologies have matured to a level enabling the accurate and reproducible quantitation of peptides and proteins. Today, with recent advances in mass spectrometry, label-free quantitative proteomics approaches are considered reliable and efficient methods for studying protein expression level changes in complex mixtures (19).
In order to characterize the extracellular and membrane proteome associated with the vasculature and the stroma in an embryonic organism in vivo, we utilized an in vivo biotinylation technique in the chicken embryo and combined it with high-resolution mass spectrometry and bioinformatics analysis to determine expression patterns and endothelial cell expression signatures. We also performed an analysis on neovessels growing (i) after the implantation of tumor cells into the chicken CAM and (ii) within granulation tissue after tissue wounding. This is the first study reported so far that provides a description of the vascular and matrix proteome in an embryonic organism by using the in vivo biotinylation method. This might constitute the starting point for further developments.

EXPERIMENTAL PROCEDURES
Tissue Biotinylation-Brown Leghorn eggs were incubated at 38°C for 3 days; then shells were cracked and the egg contents were transferred to 10-cm-diameter cell culture dishes. Embryos were cultured for another 7 days, and then 1-cm² injuries were inflicted on the CAM in the form of superficial scalpel cuts and subsequent scraping-off of the epithelium. The wound area was then covered with a nylon grid. At the opposite location, the CAM epithelium was scratched with a scalpel and covered with a nylon grid, and 10 µl of glioblastoma slurry (300,000 cells/µl) was poured onto the grid center. Petri dishes with chicken embryos were then returned to the incubator for another 6 days of incubation, after which biotinylation was performed. The chest of the E16 embryo was opened, the right pulmonary artery was cannulated, and the embryo was perfused with 15 ml of heparinized Ringer's solution. After that, 15 ml of sulfo-NHS-LC-LC-biotin (1 mg/ml) was injected into the embryonic circulation and quenched by subsequent injection of Tris-glycine buffer (20). Finally, the amine buffer was washed out with Ringer's solution, and embryonic tissues were excised and analyzed. After protein extraction, lysates were mixed with streptavidin-linked beads pre-washed twice with lysis buffer. After incubation, the bead slurry was washed and transferred to Eppendorf tubes, and PNGase F and sialidase were added for deglycosylation. The beads were then transferred back to the washing column, washed again, mixed with Ultra Pure™ water, and transferred to 1-ml syringes connected to needles plugged with cotton gauze (21). Biotinylated proteins were eluted with water heated to 70°C. Collected eluates were concentrated in a SpeedVac for electrophoresis or frozen and lyophilized for subsequent mass spectrometric analysis. A more detailed description is provided in the supplemental "Experimental Procedures" section.
Protein Sample Processing-Protein samples were resuspended in Laemmli sample buffer and loaded onto a one-dimensional SDS-PAGE gel (1.5 mm by 8 cm) for one-shot analysis of the entire mixture. No fractionation was performed, and the electrophoretic migration was stopped as soon as the protein sample (10 µg) entered the separating gel. The gel was briefly stained with Coomassie Blue, and a single band containing the whole sample was cut out.
Nano-LC-MS/MS Analysis-The resulting peptides were analyzed via nano-LC-MS/MS using an Ultimate3000 system (Dionex, Amsterdam, The Netherlands) coupled to an LTQ-Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Bremen, Germany). Five microliters of each sample were loaded on a C-18 pre-column (300 µm inner diameter × 5 mm, Dionex) at 20 µl/min in 5% acetonitrile and 0.05% TFA. After 5 min of desalting, the pre-column was switched online with the analytical C-18 column (75 µm inner diameter × 15 cm, PepMap C18, Dionex) equilibrated in 95% solvent A (5% acetonitrile, 0.2% formic acid) and 5% solvent B (80% acetonitrile, 0.2% formic acid). Peptides were eluted using a 5% to 50% gradient of solvent B over 105 min at a 300 nl/min flow rate. The LTQ-Orbitrap Velos was operated in data-dependent acquisition mode with XCalibur software. Survey MS scans were acquired in the Orbitrap in the 300-2000 m/z range with the resolution set to a value of 60,000. The 20 most intense ions per survey scan were selected for collision-induced dissociation fragmentation, and the resulting fragments were analyzed in the linear trap (LTQ). Dynamic exclusion was employed within 60 s to prevent repetitive selection of the same peptide.
Database Search and Data Validation-Mascot Daemon software (version 2.3.2, Matrix Science, London) was used to perform database searches, and the Extract_msn.exe macro provided with Xcalibur (version 2.0 SR2, Thermo Fisher Scientific) was used to generate peaklists. The following parameters were set for creation of the peaklists: parent ions in the mass range of 400-4500, no grouping of MS/MS scans, and threshold at 1000. A peaklist was created for each analyzed sample, and individual Mascot (version 2.3.01) searches were performed. Data were searched against Gallus gallus (chicken) entries in the Uniprot protein database (released September 21, 2011; 145,173 sequences) or against Gallus gallus (chicken) and Homo sapiens (human) entries from the same protein database. Carbamidomethylation of cysteines was set as a fixed modification, and oxidation of methionine and modification of lysines by NHS-LC-biotin (+467.5525 Da) were set as variable modifications. The specificity of trypsin digestion was set for cleavage after K or R, and two missed trypsin cleavage sites were allowed. The mass tolerances in MS and MS/MS were set to 5 ppm and 0.6 Da, respectively, and the instrument setting was specified as "ESI-Trap." In order to calculate the false discovery rate (FDR), the search was performed using the "decoy" option in Mascot.
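As an illustration of two of the search parameters above (cleavage after K or R with up to two missed cleavages, and a 5 ppm precursor tolerance), the following minimal Python sketch enumerates candidate peptides and checks a mass tolerance. It is not the Mascot implementation; the function names are hypothetical, and the cleavage rule is simplified to "after K or R" exactly as stated in the text, with no other enzyme-specific exceptions.

```python
def tryptic_peptides(sequence, max_missed=2):
    """Enumerate peptides from cleavage after K or R (simplified rule),
    allowing up to max_missed missed cleavage sites."""
    # First cut the sequence into fully cleaved fragments.
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    # Then join up to max_missed + 1 consecutive fragments.
    peptides = []
    for i in range(len(fragments)):
        for j in range(i, min(i + max_missed + 1, len(fragments))):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def within_ppm(observed_mz, theoretical_mz, tolerance_ppm=5.0):
    """True if the relative mass error is within the tolerance (in ppm)."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tolerance_ppm
```

For example, tryptic_peptides("AAKGGRCC", max_missed=1) yields the three fully cleaved fragments plus the two peptides that span one missed site.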
Peptide identifications extracted from Mascot result files were validated with in-house software if their score was greater than the Mascot homology threshold (when available; otherwise the Mascot identity threshold was used) for a given Mascot p value, which was adjusted to get a final peptide FDR of 5%. The FDR at the peptide level was calculated as described by Navarro and Vazquez (22). Using this method, the p value was automatically adjusted to obtain an FDR of 5% at the peptide level. Validated peptides were assembled into protein groups following the principle of parsimony (Occam's razor), which involves the creation of a minimal list of protein groups explaining the list of peptide spectrum matches. Protein groups were then re-scored for the protein validation process. For each peptide match belonging to a protein group, the difference between its Mascot score and its homology threshold (or identity threshold) was computed for a given p value (automatically adjusted to increase the discrimination between target and decoy matches), and these "score offsets" were then summed to obtain the protein group score. Protein groups were validated based on this score to obtain an FDR of 1% at the protein level (FDR = number of validated decoy hits / (number of validated target hits + number of validated decoy hits) × 100).
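The protein-level scoring and FDR formula above can be sketched as follows. This is a minimal illustration in Python, not the in-house software: the function names and the (score, is_decoy) input layout are assumptions, and the cutoff search is a simple scan that ignores how ties between target and decoy scores are handled.

```python
def fdr_percent(n_target, n_decoy):
    """FDR = decoy hits / (target hits + decoy hits) x 100, as defined in the text."""
    total = n_target + n_decoy
    return 100.0 * n_decoy / total if total else 0.0


def protein_group_score(peptide_matches):
    """Sum of score offsets (Mascot score minus threshold) over the peptide
    matches of one protein group. peptide_matches: list of (score, threshold)."""
    return sum(score - threshold for score, threshold in peptide_matches)


def lowest_cutoff_at_fdr(hits, max_fdr_percent=1.0):
    """Scan hits from best to worst score and return the lowest score at which
    the cumulative FDR is still within the limit (None if never).
    hits: list of (score, is_decoy) pairs; a hypothetical input layout."""
    best = None
    n_target = n_decoy = 0
    for score, is_decoy in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_decoy:
            n_decoy += 1
        else:
            n_target += 1
        if fdr_percent(n_target, n_decoy) <= max_fdr_percent:
            best = score
    return best
```

With 99 validated target groups and 1 validated decoy group, fdr_percent gives exactly the 1% protein-level limit used here.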
Data Quantification-The quantification of proteins was performed using the label-free module implemented in the MFPaQ software (23). For each sample, the software uses the validated identification results and extracts ion chromatograms (XICs) of the identified peptide ions in the corresponding raw nano-LC-MS files, based on their experimentally measured retention time and monoisotopic m/z values. The time value used for this process was retrieved from Mascot result files, based on an MS2 event matching the peptide ion. If several MS2 events were matched to a given peptide ion, the software checked the intensity of each corresponding precursor peak in the previous MS survey scan. The time of the MS scan that exhibited the highest precursor ion intensity was attributed to the peptide ion and then used for XIC extraction, as well as for the alignment process. Peptide ions identified in all the samples to be compared were used to build a retention time matrix in order to align LC-MS runs. If some peptide ions were sequenced via MS/MS and validated only in some of the samples to be compared, their XIC signal was extracted in the nano-LC-MS raw file of the other samples using a predicted retention time value calculated from this alignment matrix via a linear interpolation method. The quantification of peptide ions was performed based on calculated XIC area values. To perform normalization of a group of comparable runs, the software computed XIC area ratios for all the extracted signals between a reference run and all the other runs of the group and used the median of the ratios as a normalization factor.
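The two numerical steps described above, retention time prediction by linear interpolation and median-ratio normalization, can be sketched as follows. This is a minimal Python illustration and not the MFPaQ code; the function names and dictionary layout are hypothetical, and the direction of the ratio (reference divided by run, applied as a multiplier) is an assumption, since the text does not specify it.

```python
from statistics import median


def predict_retention_time(rt_source, landmarks):
    """Linearly interpolate a retention time in another run.
    landmarks: (rt_in_source_run, rt_in_target_run) pairs for peptide ions
    identified in both runs."""
    points = sorted(landmarks)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= rt_source <= x1:
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (rt_source - x0) / (x1 - x0)
    raise ValueError("retention time outside the aligned range")


def normalization_factor(reference_areas, run_areas):
    """Median of XIC area ratios (reference / run) over shared peptide ions."""
    ratios = [reference_areas[ion] / run_areas[ion]
              for ion in reference_areas
              if ion in run_areas and run_areas[ion] > 0]
    return median(ratios)


def normalize_run(reference_areas, run_areas):
    """Rescale every XIC area of a run by its normalization factor."""
    factor = normalization_factor(reference_areas, run_areas)
    return {ion: area * factor for ion, area in run_areas.items()}
```

Using the median rather than the mean keeps the factor insensitive to the minority of peptide ions that genuinely differ between runs.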
In order to perform protein relative quantification in different samples, a protein abundance index (PAI) was calculated, defined as the average of XIC area values for at most three intense reference tryptic peptides identified for the protein (the three peptides exhibiting the highest intensities across the different samples were selected as reference peptides, and these same three peptides were used to compute the PAI of the protein in each sample; if only one or two peptides were identified and quantified in the case of low-abundance proteins, the PAI was calculated based on their XIC area values). In the case of SDS-PAGE fractionation, integration of quantitative data across the fractions was performed as indicated in the text by summing the PAI values for fractions adjacent to the fraction with the best PAI (the same three consecutive fractions for all the samples to be compared). For differential studies (here, between the CAM samples and tumor samples and between the CAM samples and the wound samples), a Student's t test on the PAI values was used for statistical evaluation of the significance of variations in expression level. For proteins with missing PAI values, an in-house script inferred these values by assigning a random value according to observed intensity variation. In the CAM-tumor and CAM-wound sample comparisons, a 2-fold change and a p value of 0.05 were used as combined thresholds to define biologically regulated proteins. A volcano plot was used to represent these varying proteins.
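The PAI definition and the combined thresholds above can be sketched in Python as follows. This is an illustrative sketch only: the function names and data layout are hypothetical, "highest intensities across the different samples" is interpreted here as the highest summed XIC area (an assumption), and the t test p value is taken as an input rather than computed.

```python
def select_reference_peptides(xic_by_sample, n=3):
    """Pick the n peptides with the highest summed XIC area across samples.
    xic_by_sample: {sample_name: {peptide: xic_area}}."""
    totals = {}
    for areas in xic_by_sample.values():
        for peptide, area in areas.items():
            totals[peptide] = totals.get(peptide, 0.0) + area
    return sorted(totals, key=totals.get, reverse=True)[:n]


def protein_abundance_index(areas, reference_peptides):
    """Average XIC area of the reference peptides quantified in one sample."""
    values = [areas[p] for p in reference_peptides if p in areas]
    return sum(values) / len(values) if values else None


def is_regulated(pai_control, pai_test, p_value, min_fold=2.0, max_p=0.05):
    """Combined thresholds: at least a 2-fold change in either direction
    and a p value below 0.05 (p value supplied by a separate t test)."""
    fold = pai_test / pai_control
    changed = fold >= min_fold or fold <= 1.0 / min_fold
    return changed and p_value < max_p
```

Because the same reference peptides are used in every sample, the PAI ratio between two samples is not distorted by differences in which peptides happened to be sequenced.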
The quantification values obtained (PAI) for all proteins in CAM, intestine, kidney, and liver samples were transformed into a matrix that was used in the R software to organize the proteins into cluster groups. For the three replicates, the median values of PAI were calculated and divided by the maximum value for the same protein between the four median values (CAM, intestine, kidney, and liver). All values were thereby scaled to lie between 0 and 1.
This matrix from CAM, intestine, kidney, and liver experiments (three replicates per sample) was used to generate clusters using "Ward's hierarchical cluster analysis" in R Commander (V1.6-2). The clustering parameters were set to use the Ward method for hierarchical clustering; the distance measure was Euclidean, and the group number was set at 10.
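The scaling step that precedes clustering can be sketched as follows. This is a minimal Python illustration of the median-then-divide-by-maximum transformation only; the function name and input layout are hypothetical, and the Ward clustering itself (performed in R Commander in this study) is not reproduced here.

```python
from statistics import median

# Tissue order of the CIKL group as named in the text.
TISSUES = ("CAM", "intestine", "kidney", "liver")


def scale_protein_profile(replicate_pai):
    """Take the median PAI of the replicates for each tissue, then divide by
    the largest of the four medians so that the profile lies between 0 and 1.
    replicate_pai: {tissue: [PAI of each replicate]}."""
    medians = {tissue: median(replicate_pai[tissue]) for tissue in TISSUES}
    top = max(medians.values())
    if top == 0:
        return {tissue: 0.0 for tissue in TISSUES}
    return {tissue: medians[tissue] / top for tissue in TISSUES}
```

After this scaling, a protein enriched in a single tissue has a profile close to 1 in that tissue and close to 0 elsewhere, regardless of its absolute abundance.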
The percentages of membrane and extracellular proteins in the different samples were evaluated using Protein Center software V3.8.2017 (Proxeon Bioinformatics, Seattle, WA).
Human-chick ortholog identification, endothelial cell expression signatures, angiogenic expression signatures, and the abstract scanning for relevance to angiogenesis were carried out using methods described elsewhere (24) (for more details, see the supplemental "Experimental Procedures" section).
All other procedures (histochemistry, immunohistochemistry, Western blotting, etc.) were done according to standard protocols. A detailed description of the materials and methods can be found in the supplemental "Experimental Procedures" section.

Implantation of Tumor Cells and Induction of Granulation Tissue-After 10 days of culture in shell-less conditions, embryos develop CAMs with dense vasculature, as can be seen in Fig. 1A. On day 10, U87 glioma cells were implanted onto the CAM and grew into a highly vascularized tumor 4 days later (Fig. 1B). A wound was inflicted on another area of the CAM and was covered with a square nylon grid. As shown in Fig. 1C, newly formed blood vessels grew through the grid.
To verify the perfusion of neovessels, colloidal carbon particles were injected into a CAM vein 5 days after U87 glioma cell implantation or wound incision (Figs. 1D-1F). Carbon particles localized to all normal vessels of the CAM and to neovessels of both granulation and U87 tumor tissues. This indicates that the biotinylation reagent, when injected into the chicken circulation, can be delivered into the entire vasculature of normal, granulation, or xenografted tumor tissues.
Embryo Perfusion and in Vivo Biotinylation-The right pulmonary artery of E16 embryos was cannulated, and the embryo yolk, CAM, and embryo circulation connected with the pulmonary artery via the ductus arteriosus were washed and perfused with a biotin solution as described under "Experimental Procedures" (Figs. 2A-2D). The flow rate was adjusted to 1 ml/min to allow optimal perfusion and preserve vessel integrity. The biotinylation reaction was stopped by injection of Tris-glycine buffer. After biotinylation, different tissues (CAM, liver, small intestine, kidney), implanted tumors, or granulation tissue were removed and processed for immunostaining or protein extraction as indicated under "Experimental Procedures" (for perfusion in real time, see the video included in the supplementary materials).
Significant biotinylation of blood vessels was seen in the different tissues (Figs. 3A-3F). Fast red staining of biotinylated proteins was found in all organs and tissues examined (liver, intestine, kidney, CAM, wound, and xenografted tumor). The staining of biotin was pronounced in all tissues and was localized to the vessels or the extracellular matrix (basal lamina, intercellular matrix). In the liver (Fig. 3A), staining was prominent around sinusoidal vessels. In the intestine (Fig. 3B), strong immunoreactivity was found in intestinal villi and underlying tissues. In the kidney (Fig. 3C), immunoreactivity was detected around tubules and glomeruli. In the CAM, a trabecular pattern of staining was seen (Fig. 3D). Because of the leaky nature of neovessels, additional biotin reactivity was observed in and around the granulation tissue (Fig. 3E). Similarly, in xenografted tumors, strong positive biotin staining was detected around tumor cells and in the matrix (Fig. 3F).
To visualize more precisely the staining pattern of biotin in relation to blood vessels and other tissue elements, double staining was performed using SNA isolectin (endothelial cells), anti-desmin antibodies (pericytes), or anti-α-smooth muscle actin (smooth muscle cells and myofibroblasts) antibodies (14) (Figs. 4A-4F). In the CAM tissue, double staining for biotinylated proteins and pericytes/vascular smooth muscle cells indicated highly significant biotinylation in and around blood vessels, evidenced by the staining around pericytes (Figs. 4A and 4B). In tumors, double staining was detected around blood vessels (endothelial cells), which indicates biotinylation of proteins in the basement membrane (Figs. 4C and 4D). However, biotinylated proteins were also found at a distance from blood vessels and around tumor cells, which indicates penetration of biotinylating reagent into the tumor. In the wound, double staining of the biotinylated proteins and α-smooth muscle actin-positive myofibroblasts showed significant biotinylation in non-vascular structures (Fig. 4E). Furthermore, in tissue such as the kidney, biotinylation of interstitial tissue around α-smooth muscle actin-positive pericytes was seen (Fig. 4F). This indicates that in growing tissues like tumor and wound, in which vessels are characterized by the extensive extravasation of macromolecules, the in vivo biotinylation technique also labels distinct matrix proteins.
Purification and Identification of Biotinylated Proteins-Proteins from total tissue lysate and those purified on the streptavidin beads were separated on SDS-PAGE gels and revealed as a continuous smear with no discrete banding (lanes 1, 5, and 6, supplemental Fig. S1). Evidence for purification quality was found; except for cellular carboxylases that covalently harbor endogenous biotin, no other proteins were purified from total tissue lysates obtained from organs of non-biotinylated embryos. As another quality indicator, β-actin, the most abundant intracellular protein, was not detected via Western blot in a purified biotinylated protein mixture. The elution of biotinylated proteins was aided by a novel and recently described technique in which strong biotin-streptavidin bonds are dissociated in pure water heated to 70°C (21). The addition of biotin before elution prevented destabilization of the streptavidin tetramer, which prevented the overrepresentation of single proteins that would mask less well represented proteins in the subsequent mass spectrometry analysis.
For all the protein samples isolated, we performed deglycosylation and two SDS-PAGE gel separations. A "one-shot" mass spectrometry analysis was also performed by loading the samples on a one-dimensional gel without protein fractionation (the whole sample was collected in one slice of gel).

FIG. 1. Visualization of perfusion in the chicken CAM.
Sixteen-day-old embryos were injected with colloidal carbon to visualize the functional perfusion of vessels in the CAM, the implanted tumor, and the wound. A, normal live CAM vessels. Only macrovessels can be identified. D, fixed CAM tissue stained for erythrocyte hemoglobin with DAB revealed a high-density network of capillaries. B, E, CAM engrafted with U87 tumor cells (B, live; E, injected with colloidal carbon and co-stained for hemoglobin). C, F, wounded CAM (C, live; F, injected with colloidal carbon and co-stained for hemoglobin). All vessels were filled with carbon particles, which indicates that they were functionally perfused.
After trypsin digestion, the samples were injected onto a nano-LC-MS/MS LTQ-Orbitrap Velos system (see "Experimental Procedures"). Information for single-peptide-based protein identifications (sequence identified, the precursor m/z and charge, score/E-value), appropriately labeled MS/MS spectra, masses detected, and fragment assignments is provided in the supplemental material.
We performed mass spectrometry on three independent pools of the different tissues including CAM (C), wound (W), tumor (T), kidney (K), intestine (I), and liver (L). Each pool comprised seven to nine samples from independently perfused embryos. A total of 1264 proteins were detected from all organs combined, and these are listed in supplemental Table S1 (sheet S1A).
High Number of Extracellular and Membrane Proteins-When samples were analyzed with ProteinCenter software (v3.5.2.1, Proxeon Bioinformatics, Seattle, WA), a significant number of membrane and extracellular proteins were detected. Among these proteins, between 59.7% (liver) and 69.2% (intestine) were extracellular or membrane. Among the proteins identified with Gene Ontology annotation, the percentage of extracellular/membrane proteins ranged between 67.76% (tumor) and 80.38% (intestine) (Table I).
Quantification of Proteins-The quantification of proteins was performed using the label-free module implemented in the MFPaQ software (23). For the relative quantification analysis of protein expression, a label-free quantification method was chosen because it did not require the labeling of proteins (19). This method is based on integration of the MS signals of the three best peptides of each protein (PAI values) (see "Experimental Procedures"). The quantified proteomic PAI data were divided into two separate groups: (i) CAM, tumor, and wound group (CTW); and (ii) CAM, intestine, kidney, and liver group (CIKL). Data values can be found in supplemental Table S1 (sheets S1B and S1C, respectively).
Sample Reproducibility and Distribution-To test the reproducibility of biological replicates within each tissue and between the different tissues, Pearson correlation coefficients and principal component analyses were carried out. The results of the different groups can be seen in Fig. 5. Pearson correlation results for all biological replicates indicated correlations greater than or equal to 0.84 (Figs. 5A (CTW) and 5B (CIKL)), which shows strong correlation and reproducibility of the biological replicates. Fig. 5C portrays an independent principal component plot of the CIKL pool and reveals one kidney replicate with more variation than the other kidney samples (this replicate accounts for the minimum Pearson correlation coefficient of 0.84 (Fig. 5B)). Fig. 5D shows an independent principal component analysis of the CAM, wound, and tumor samples. The largest variation in the first principal component shows that the wound and CAM samples had less global expression difference than the tumor samples, as expected.
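The replicate reproducibility check can be illustrated with a minimal Python sketch of the Pearson correlation coefficient between two replicates and of the minimum coefficient over all replicate pairs. The function names and the input layout are hypothetical, and any transformation of the PAI values applied before correlation in this study is not specified here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def minimum_pairwise_correlation(replicates):
    """Lowest Pearson coefficient over all pairs of replicates.
    replicates: {replicate_name: [PAI values in a fixed protein order]}."""
    return min(pearson(replicates[a], replicates[b])
               for a, b in combinations(sorted(replicates), 2))
```

Reporting the minimum over all pairs, as in the text, gives a conservative summary: every other pair of replicates correlates at least as well.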
In order to test the proteomic PAI data for Gaussian/normal distributions, a histogram for each pool was produced (Fig. 6). The results show a bell-like shape for both pools, characteristic of a normal distribution, which enables the use of valid parametric test statistics for differential expression with these data (Student's t test).
Human Ortholog Identification-Because genes related by evolution, particularly orthologs, usually have conserved function between vertebrates, and because human is the most characterized model species, human orthologs of chicken proteins were sought in order to further investigate their biology. All except 26 proteins were assigned a human ortholog using the PAI data set, and these can be viewed in sheets S1B and S1C of supplemental Table S1. The proteins not assigned were either chicken- or egg-specific proteins. When only the non-redundant set of human orthologs was considered, 51% were either extracellular or membrane genes. This is in line with the percentage of cell surface and extracellular matrix proteins found by Rybak et al., which ranged between 20% and 50% for all organs from an in vivo mouse biotinylation experiment (17). A total of 1157 non-redundant human orthologs were found for CIKL and CTW combined. The CIKL and CTW pools contained 1052 and 670 non-redundant human orthologs, respectively.
Functional Enrichment for Protein Sets Detected through in Vivo Biotinylation-Gene Ontology analysis using DAVID on all proteins identified showed principally, for cellular components (p value < 0.001), enrichment in the extracellular matrix and extracellular region categories (>10% of all proteins) (Fig. 7). For biological processes (p value < 0.0001), the noted categories were metabolic process, primary metabolic process, cellular process, cell communication, signal transduction, protein metabolic process, developmental process, cell adhesion, immune system process, transport, system development, cell surface receptor linked, response to stimulus, and cell-cell adhesion (>10% of all proteins) (supplemental Fig. S2A). For molecular function (p value < 0.0001), catalytic activity, hydrolase activity, receptor activity, structural molecule activity, transferase activity, and oxidoreductase activity were the most represented (>10% of all proteins) (supplemental Fig. S2B). Most categories are related to components and processes of the membrane and extracellular space, in agreement with the in vivo biotinylation.
In the wound samples, we found 33 proteins that were overexpressed relative to the CAM (supplemental Fig. S3B).
Finding Tissue Enriched Proteins-We furthermore conducted a statistical analysis of the results via Ward's hierarchical clustering method using R software (v2.12; R Commander V1.6-2) to generate clusters representing the expression profiles of the identified proteins over the different samples. For the CIKL group, 17 clusters were found. Among the 17 clusters, 4 were found to display overexpression in one tissue relative to the three others (Figs. 9A-9D). These four clusters depict liver, intestine, CAM, and kidney enriched proteins, respectively. Supplemental Table S3 depicts the clusters. In the following, the four clusters exhibiting enrichment in one organ relative to the others are discussed.
In the first cluster (liver expression, Fig. 9A), 128 proteins were found. Among these were a number of extracellular matrix molecules, cell surface-associated molecules, and receptors such as CD36, cell adhesion molecule with homology to L1CAM, COL15A1, ESAM, FABP1, HAPLN1, LAMC3, LYVE1, NGFR, PCDH11X, and PTX3. We emphasize that LYVE-1 is highly expressed in liver sinusoids and was detected via our biotinylation method. Furthermore, several enzymes that may be found in the extracellular compartment were detected (e.g., ECE1, GGT1). It is of note that ECE1 has two activities, one intracellular and one extracellular at the liver plasma membrane. Also, BRP44 has been shown in gene expression analysis to be overexpressed in the liver (Gene Expression Atlas). Furthermore, many metabolic enzymes were detected. Among these, carnitine palmitoyltransferase 1A had specific annotation pertaining to the liver tissue.
In the fourth cluster (kidney, Fig. 9D), only 10 proteins were found. It is of note that the kidney-specific angiotensin-converting enzyme was detected in the kidney in our analysis.
Tumor-specific Peptides (Human Genes)-To identify proteins that were derived from the tumor, we screened for peptides that matched human proteins exactly but not chicken proteins (supplemental Table S5). Peptides with exact matches to only human proteins (49 proteins) were included in the list (Table II). Among these proteins were a number of extracellular/membrane proteins. These included proteins already associated with tumors, such as COL1A1, COL4A1, FN1, LOXL1, TGFBI, TNC, and VIM. Furthermore, annexins (ANXA1, ANXA2, and ANXA5), which are also membrane proteins, were detected. Additionally, several metabolic enzymes were detected, including glycolytic enzymes (aldolase A, fructose-bisphosphate; aldolase C, fructose-bisphosphate; enolase 2; GAPDH; LDHA; LDHAL6B; PKM2; and TPI). It is of note that the tumor-specific PKM2, an enzyme that contributes to the Warburg effect in tumor cells, was detected in our analysis. Furthermore, heat-shock proteins (HSP90AB1, HSPA5, and HSPD1) and the antioxidant protein PRDX4 were also detected.
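The screening principle described above, retaining peptides with an exact match in human but not in chicken sequences, can be sketched as follows. This is a minimal Python illustration under simplifying assumptions: the function name and dictionary layout are hypothetical, matching is plain substring matching, and the fact that leucine and isoleucine cannot be distinguished by mass is ignored.

```python
def human_specific_peptides(peptides, human_proteins, chicken_proteins):
    """Keep peptides found verbatim in at least one human sequence and in no
    chicken sequence. Both protein arguments map accession -> sequence."""
    def occurs(peptide, proteins):
        return any(peptide in sequence for sequence in proteins.values())

    return [peptide for peptide in peptides
            if occurs(peptide, human_proteins)
            and not occurs(peptide, chicken_proteins)]
```

In a xenograft setting such as this one, peptides shared between the two species are uninformative about origin, which is why only the human-exclusive matches are retained.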
We next performed a literature search for glioma involvement of human-specific peptide genes. Keyword searches using "glioma" and "glioblastoma" revealed that 17 of the 48 genes had published evidence of glioma involvement, and these are listed with PubMed ID numbers in Table III. These results indicate good correlation of the identified proteins to known genes involved in glioma pathobiology.
Note: This table lists glioma genes identified by peptide matching. Genes had peptides specific to human (glioma cells), with eight genes matching at least seven different human-specific peptides. Ten genes had promiscuous but human-specific peptides (data not shown).
Endothelial Cell Expression of Orthologs (EndoFactor)-The human orthologs of the chicken proteins were then analyzed for their endothelial cell expression (which we call their EndoFactor) using methods similar to those of Herbert et al., who developed an analysis of endothelial gene expression signatures (25). In this work, we combined cDNA libraries with a Roche 454 RNA-seq hypoxia liver endothelium library to identify endothelial-expressed genes within the proteomic data. For the full set of orthologs, gene symbols were matched between the orthologs and the endothelial-enriched genes. Two criteria were used to rank the genes: (i) the gene was up-regulated in endothelial cells with log2(FC) ≥ 1 (2-fold), or (ii) the gene showed an endothelial-specific profile, with zero counts in the non-endothelial pool. Two hundred and five genes with an enriched endothelial expression profile were found (supplemental Table S6, sheet S6A). Seventy-nine of these genes were membrane, 46
Angiogenic Screen and Possible Vascular Targets-The proteomic technique used here was preferentially employed to acquire membrane and extracellular proteins from a developing chicken embryo. Because angiogenesis is a process pivotal to both embryonic development and tumor growth, an angiogenic cDNA library screen was performed to compare tumor/fetal and normal bulk public tissue libraries in order to rank human orthologs as potential vascular targets. This was also combined with the endothelial cDNA/454 screen, and 136 genes had a positive FC in both the endothelial and the angiogenic screens (supplemental Table S6, sheet S6B). Several genes among these are ANGPT2, ANGPTL2, ANXA5, COL15A1, EPHB1, EPHB2, IGFBP7, IL31RA, JAM3, DKK3, MMP15, PLXNA2, PLXNA4, PTX3, SULT1E1, SPARC, PXDN, COL4A1, NID1, and VIM. Some of these genes have already been published as tumor endothelial markers/vascular targets. For instance, the genes COL4A1, SPARC, VIM, and IGFBP7 were reported by van Beijnum et al. as tumor angiogenesis genes (26), and the genes PXDN, PLXNA2, NID1, and DKK3 were reported as endothelial markers of brain tumors (27) or colon tumors (28). Therefore, this list could contain other potential tumor endothelial genes.
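The two EndoFactor ranking criteria described above (log2(FC) ≥ 1, or zero counts in the non-endothelial pool) can be illustrated with a minimal sketch. The function names, the gene labels, the counts, and the normalization by pool size are hypothetical and chosen only for illustration; they are not taken from the pipeline used in this study.

```python
import math

def endo_factor(endo_count, non_endo_count, endo_total, non_endo_total):
    """Return (log2 fold change, endothelial_specific) for one gene.

    Counts are tag/read counts in pooled endothelial and non-endothelial
    libraries; normalizing by pool size is an assumption of this sketch.
    """
    if endo_count == 0:
        # Not observed in endothelial libraries: cannot be enriched.
        return float("-inf"), False
    if non_endo_count == 0:
        # Criterion (ii): zero counts in the non-endothelial pool.
        return float("inf"), True
    ratio = (endo_count * non_endo_total) / (non_endo_count * endo_total)
    return math.log2(ratio), False

def is_endothelial_enriched(log2_fc, specific, threshold=1.0):
    # Criterion (i): log2(FC) >= 1, i.e. at least 2-fold up-regulated.
    return specific or log2_fc >= threshold

# Hypothetical (endothelial, non-endothelial) counts per gene.
counts = {
    "GENE_A": (40, 10),   # 4-fold enriched
    "GENE_B": (5, 0),     # endothelial specific
    "GENE_C": (10, 10),   # not enriched
}
ENDO_TOTAL, NON_ENDO_TOTAL = 1000, 1000

enriched = sorted(
    gene for gene, (e, n) in counts.items()
    if is_endothelial_enriched(*endo_factor(e, n, ENDO_TOTAL, NON_ENDO_TOTAL))
)
print(enriched)
```

In this sketch a gene passes if either criterion holds, which mirrors the "or" in the ranking rule; the same fold-change test could be applied to the angiogenic screen and the two gene sets intersected.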
Literature Scanning to Determine Angiogenesis Involvement-To provide further evidence regarding whether known angiogenesis-related genes were in fact being selected with the proteomic technique, angiogenesis and tumor angiogenesis keywords were searched against total publications for a gene to derive an "AngioScore" (the percentage of publications matching a word to the total number of publications) (supplemental Table S6, sheet S6C). Forty-three of these had an AngioScore ≥ 50. The best AngioScores were found for KDR (94) (63), ANG (62), and TIE1 (62). It is of note that many of the classic, well-known endothelial genes or angiogenesis factors/receptors were present in both lists (EndoFactor and AngioScore). Furthermore, we also detected recently identified angiogenesis regulators such as LOXL2.
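The AngioScore defined above is a simple percentage and can be sketched as follows. The gene labels and publication counts are invented for illustration, and the handling of genes without any publications is an assumption of this sketch.

```python
def angio_score(keyword_hits, total_publications):
    """Percentage of a gene's publications that match an angiogenesis keyword."""
    if total_publications == 0:
        # Assumption: genes without publications receive a score of 0.
        return 0.0
    return 100.0 * keyword_hits / total_publications

# Hypothetical (keyword-matching, total) publication counts per gene.
publication_counts = {
    "GENE_A": (47, 50),
    "GENE_B": (10, 40),
    "GENE_C": (0, 0),
}

scores = {
    gene: angio_score(hits, total)
    for gene, (hits, total) in publication_counts.items()
}

# Genes at or above the cut-off of 50 used in the text.
selected = sorted(gene for gene, score in scores.items() if score >= 50)
print(scores, selected)
```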
We also compared the percentages of proteins with positive EndoFactors or AngioScores for the different tissues tested (Table IV, supplemental Table S7). All proteins with a positive EndoFactor (log2(FC) ≥ 1) were included, and for the AngioScores, a cut-off of 50 was chosen. It is interesting that the percentages of proteins with endothelial cell signatures were relatively constant at a total of 20%. This indicates that one-fifth of the proteins identified exhibit an endothelial cell expression signature.
Experimental Validation of Genes-We could not perform immunohistology on chicken tissue because of the lack of specific chicken antibodies for these proteins. Therefore, we analyzed the expression of some of the human proteins in tumors grown on the chicken CAM derived from human U87 glioma cells (Fig. 10).
The selection of tumor-enriched proteins for validation was based on four techniques: (i) Peptides that matched human proteins exactly but not chicken proteins were priority candidates, as they were of human origin. (ii) As tumor peptides can match both human and chicken genes, differential gene expression of proteins between tumor and CAM identified further candidate genes. (iii) Protein expression profiles of tumor, wound, and CAM samples were clustered using the Cluster Affinity Search Tool to derive tumor-enriched candidate genes. (iv) Finally, public Unigene cDNA libraries, literature scans, and cancer databases were surveyed to rank the candidate genes. Following this strategy, 10 genes were chosen for validation, and these are listed in Table V. Immunostaining was performed by immunofluorescence staining for selected proteins associated with vessels or the extracellular matrix. We performed immunostaining for Col IV (a), FSTL1 (b), HRNR (c), LOXL2 (d), Plexin D1 (e), SIRPA (f), SRPX (g), Tenascin C (h), TGF-βi (i), and fibronectin (j). These were selected for several reasons. FSTL1, HRNR, LOXL2, PLXND1, and SRPX are not known to be expressed in glioma, and SIRPA has low expression. TGF-βi, human tenascin, and fibronectin had excellent peptide coverage and were well detected via immunolabeling. Blood vessels were stained using SNA-1 isolectin as previously described (14). Human proteins were detected in tumor cells, in the tumor matrix, or around blood vessels, but not in blood vessels. In addition, HRNR exhibited a vessel-associated staining pattern. These results are in agreement with the mass spectrometry data from the U87 xenografts obtained after in vivo biotinylation.
[Table data by tissue; column headings and the label of the first row are missing in the source: 1295, 670, 49; Kidney 1090, 191, 43; Liver 1097, 193, 42; Intestine 1101, 191, 43; Tumor 664, 106, 30; Wound 670, 114, 31]
We then performed additional staining on human tumor samples from oligo-astrocytoma (supplemental Fig. S4). Tumor samples were stained with anti-FSTL1, anti-SIRPA, anti-HRNR, and anti-SRPX antibodies. We used sheep anti-CD31 for co-staining with mouse anti-COL IV. FSTL1, SIRPA, and SRPX were tumor and stroma associated without vessel staining. In contrast, HRNR was found associated with the vasculature, and for Col IV a vascular staining was observed.

DISCUSSION
Novel Approach-In this research, we applied in vivo biotinylation combined with high-resolution mass spectrometry and bioinformatic analyses to study the vascular and matrix proteome in the chicken embryo. This is the first time such an approach has been used in an embryonic organism. We provide, in addition, the first description of the vascular and matrix proteome in an embryonic organism. We performed our analysis not only on several embryonic organs, but also on tumor cells xenografted into the chicken CAM and granulation tissue induced by wounding. In contrast to previous techniques, we purified the biotinylated proteins by means of mild warm-water elution, which significantly reduced the background and prevented the overrepresentation of single proteins that would mask less well-represented proteins in the mass spectrometry analysis. Also, proteins were deglycosylated, as described elsewhere (16), to increase the pool of peptides available for spectrometry and proteomic fingerprinting.
There have been several publications reporting proteomic studies on the chicken embryo. These include, for example, studies on the chicken at stage 29 of development (29), embryo chicken vitelline membrane (30), the facial development of E3-E5 embryos (31), subcutaneous gel in avian hatchlings (32), the embryonic chicken retina (33,34), the chicken egg yolk plasma and granules (35), the chicken egg white (36,37), cerebrospinal fluid (38), embryonic chicken gonadal primordial germ cells (39), matrix vesicles from chicken femur (40), the calcified eggshell layer (41), and the chicken cardiovascular system (42). These studies mainly used standard proteomic techniques such as one-dimensional or two-dimensional gel electrophoresis of total proteins, often coupled with MALDI-TOF-MS. These techniques provide information on a global scale of proteins found in a total tissue extract, but not in tissue-specific compartments. Furthermore, these techniques cannot distinguish specific components of plasma membrane and extracellular matrix from intracellular proteins.
In contrast to these studies, our procedure involved the in vivo biotinylation of membrane or extracellular proteins associated with the vasculature and stroma. It is impossible to avoid also detecting intracellular or yolk proteins because of vascular leakage or cell and tissue injury after perfusion. Our technique is based on the method published by Neri and collaborators, who pioneered the detection and analysis of extracellular or cell membrane proteins via in vivo biotinylation in adult rodents (16,17). They have applied their technique to identify novel, potential therapeutic targets in tumors grown in mice. More recently, they have also applied their technique to human colon cancer via ex vivo perfusion (18). However, our procedure differs from theirs in several respects. Firstly, we removed biotinylated proteins from the column after binding to streptavidin not by means of trypsin digestion, but through elution with water at 70°C, as it has been previously shown that water at 70°C is able to reversibly break the interaction between biotin and streptavidin (21). Secondly, by using this method and all the required controls, we showed that our proteins were indeed biotinylated and that this protein fraction was largely extracellular and membrane associated. This is different from trypsin-clipping on columns, where only digested peptides, and not entire biotinylated proteins, are eluted from the columns, thus making it impossible to determine whether proteins are biotinylated or not after elution. In addition, we also deglycosylated our biotinylated proteins in a manner similar to that reported by the Neri laboratory (43).
Enrichment of Extracellular and Membrane Proteins-A great number of the proteins identified were found to belong to cellular components or biological processes of the extracellular space when a DAVID Gene Ontology analysis was performed. This is in agreement with the immunohistology, which evidenced significant in vivo biotinylation in the extracellular space around vessels and in the surrounding tissue. However, in the case of tumor vasculature, we detected significant quantities of intracellular proteins. This is most likely due to a high necrosis rate and, as a consequence, a release of intracellular components into the interstitium, which occurs in a fast-growing tumor. The intracellular proteins detected reflect the active metabolic demand and the Warburg effect that occurs in tumor cells. In particular, we found that PKM2, which had one of the best peptide coverages, was specifically expressed in tumor cells and not in transformed cells (44-46). Furthermore, LDH was also detected, which reflects active lactate production in tumor cells.
[Table legend fragment; the beginning is missing in the source] a gene, the percentage of total Unigene cluster expressed sequence tags (ESTs) expressed in brain for a gene, a Rembrandt database glioma enriched profile, total PubMed article abstracts, and the number of abstracts that contain one or more of the following keywords: glioma, glioblastoma, astrocytoma, and neuroblastoma.
Differential Expression-Differential protein expression analysis was performed via a label-free quantitative proteomic approach as described under "Experimental Procedures." We emphasize that our results led to the identification of proteins that are highly characteristic of a given tissue. When protein clusters were analyzed for overexpression in a specific organ, the greatest number of proteins was found in the liver. This was expected, as the liver is an organ that has many biosynthetic and metabolic functions.
Liver Enriched Proteins-Most importantly, liver-enriched proteins found in this study have also been reported to be enriched in the liver in other transcriptomic studies and databases. For example, LYVE-1, detected in this analysis, is known to be highly overexpressed in liver sinusoids (47). FABP1 was found, as expected, to be enriched in liver. Fatty acid binding proteins are a family of small, highly conserved carrier proteins that bind long-chain fatty acids and other hydrophobic ligands. FABP1 can also be found outside the cell and is able to bind bile acids (48). It is thought that the roles of fatty acid binding proteins include fatty acid uptake, transport, and metabolism and that they are regulated by the liver-enriched transcription factors HNF3β and C/EBPα (49). The fatty acid translocase CD36 was found, as might be expected, in the liver. Fatty acid translocase CD36 (FAT/CD36) mediates the uptake and intracellular transport of long-chain fatty acids in diverse cell types. It has a role in liver pathology, including hepatic steatosis and non-alcoholic fatty liver disease (50). The endothelial cell adhesion molecule (ESAM) is a 55-kDa type I transmembrane glycoprotein of the JAM family, part of the immunoglobulin superfamily, and is found in endothelial cells. ESAM has been detected in the liver sinusoids and is responsible for leukocyte migration in the liver (51). ESAM also might be present on hematopoietic stem cells in the liver; evidence in support of this has been reported in murine fetal liver (52). Endothelin converting enzyme (ECE1) has been described in the liver and may be up-regulated after injury, inflammation, or allograft rejection (53,54). It has two activities, one intracellular and another extracellular at the liver plasma membrane. Brain protein 44 has been shown in gene expression analysis to be overexpressed in the liver (Gene Expression Atlas). LAMC3 is a component of basement membranes. It is also found in other tissues, such as skin, heart, lung, and the reproductive tracts, and is a prominent element of the apical surfaces found on ciliated epithelial cells. It has been detected in the retinal vasculature and is thought to play an important role in vessel development (55). LAMC3 has not yet been described in the liver but could be present in liver basement membranes or in the liver vasculature.
Intestine Enriched Proteins-The intestinal enriched cluster exhibited a number of proteins that have previously been described in the digestive system. For example, GPA33 is a transmembrane protein, found overexpressed in the digestive system and colon (Gene Expression Atlas Database).
The leucine-rich repeats and Ig-like domains 1 (LRIG1) gene encodes a transmembrane protein that interacts with ErbB receptors and promotes their degradation. It is a key player in the control of intestinal homeostasis. LRIG1 has recently been found to be enriched at the crypt base and in the progenitor compartment of the small intestine and colon. It limits the size of the intestinal progenitor compartment by dampening EGF/ErbB-triggered stem cell expansion (56).
Chromogranin B encodes a tyrosine-sulfated secretory protein abundant in peptidergic endocrine cells and neurons. This protein might serve as a precursor for regulatory peptides. Angiogenic properties of this molecule have also been described (57). This gene has been described as present in the small intestine of the chicken (58). The MAM domain containing glycosylphosphatidylinositol anchor 1 (MDGA1) is a glycosylphosphatidylinositol-anchored membrane protein that has a role in the nervous system at the plasma membrane. It has also been found in the intestine and is down-regulated in inflammatory bowel disease. A protein classically found in the central nervous system that we identified in the intestine cluster is neuroligin 3. This protein is a member of a family of neuronal cell surface proteins. Members of this family may act as splice site-specific ligands for beta-neurexins and might be involved in the formation and remodeling of central nervous system synapses (59). It is interesting that this protein is also detected in the intestine. It might play a role in innervation of the gut.
CAM Enriched Proteins-The CAM cluster exhibited a number of proteins that are involved in adhesion or other extracellular functions. Annexins (ANXA2, ANXA1) were detected. These proteins have multiple functions such as Ca2+ binding, interaction with phospholipids, and recognition of soluble factors such as CXCL12 (60). Among the matrix molecules found in this study, lumican and collagen 17 were overexpressed. SIDEKICK2 encodes a protein that is a member of the immunoglobulin superfamily. The protein contains two immunoglobulin domains and 13 fibronectin type III domains and might serve as an adhesion receptor (61). Another adhesion molecule is FAT2, which is an integral membrane protein characterized by the presence of cadherin-type repeats and plays a role in the control of planar cell polarity (62). We also detected CD99, which has been described as expressed in immune cells and is a key mediator of the transendothelial migration of neutrophils and dendritic cells (63,64). We found orosomucoid (alpha1-acid glycoprotein), which is needed to maintain the high capillary permselectivity required for normal homeostasis. It was detected in endothelial cells as well as in the liver. Follistatin-like 1, an activin-binding protein, was also found in our analysis. Finally, proteases and their inhibitors were also overexpressed in the CAM cluster (AMBP, SPARC, SPINT1, and SERPINB3).
Kidney Enriched Proteins-In the kidney cluster, only a limited number of overexpressed proteins were detected. It is of note that angiotensin-converting enzyme was found in this group. Angiotensin-converting enzyme is synthesized in the kidney and is a major regulator of vascular tone (65).
t Test of Tumor versus Wound-Several interesting observations were made when wound and tumor expression were compared using Volcano plots. Among the proteins overexpressed were several matrix molecules and receptors (ANXA5, ITGB5, collagen triple helix repeat containing 1, MSLN, and VNN2). Furthermore, complement and coagulation factors, proteases, and their inhibitors were detected (angiotensinogen, MMP2, F9, SERPINA10, SERPINA1, C3, F2, A2ML1, CFB, and KNG1). However, some carrier proteins, intracellular binding proteins, or enzymes were surprisingly highly overexpressed, such as NAMPT, OAS3, POMGNT1, and PDIA3. This most likely was due to a high necrosis rate, which leads to the release of intracellular components into the interstitium, as occurs in a fast-growing tumor.
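The tumor versus wound comparison above rests on a fold change and a t test per protein. The following minimal sketch computes both quantities for one protein from replicate intensities. It assumes Welch's unequal-variance form of the t statistic, which may differ from the test actually used in this study, and the intensity values are invented. Converting the statistic into the P value plotted in a Volcano plot additionally requires the t distribution (for example from a statistics library) and is not shown here.

```python
import math
from statistics import mean, variance

def volcano_coordinates(tumor, wound):
    """Return (log2 fold change, Welch t statistic) for one protein.

    Both inputs are replicate intensities (positive numbers, at least two
    replicates per group with nonzero variance in at least one group).
    """
    mean_tumor, mean_wound = mean(tumor), mean(wound)
    log2_fc = math.log2(mean_tumor / mean_wound)
    # Standard error of the difference of means without pooling variances.
    standard_error = math.sqrt(
        variance(tumor) / len(tumor) + variance(wound) / len(wound)
    )
    t_statistic = (mean_tumor - mean_wound) / standard_error
    return log2_fc, t_statistic

# Hypothetical label-free intensities for a single protein.
tumor_intensities = [8.0, 9.0, 7.0]
wound_intensities = [2.0, 3.0, 1.0]

log2_fc, t_statistic = volcano_coordinates(tumor_intensities, wound_intensities)
print(log2_fc, t_statistic)
```

A positive log2 fold change here corresponds to tumor enrichment and a negative one to wound enrichment.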
NAMPT was the most tumor-enriched protein. NAMPT is the gene symbol of the enzyme nicotinamide phosphoribosyltransferase, which is important in the biosynthesis of nicotinamide-adenine-dinucleotide. It catalyzes the condensation of nicotinamide and 5-phosphoribosyl-pyrophosphate to produce nicotinamide mononucleotide and inorganic pyrophosphate, and it is the rate-limiting enzyme in the salvage pathway of nicotinamide-adenine-dinucleotide synthesis. In the literature, NAMPT has been shown to be up-regulated in colon cancer at both mRNA and protein levels (66,67). In addition, it has been shown to be up-regulated in glioma and has been proposed as a serum prognostic marker of glioblastoma (68). It has been shown that adipokine, the extracellular isoform of NAMPT, promotes angiogenesis and is a potent activator of in vivo neovascularization of the chick CAM (68). Furthermore, Notch1-dependent production of FGF2 is induced in NAMPT-primed endothelial angiogenesis. Several groups have shown that NAMPT/Visfatin can influence angiogenesis (69-72). NAMPT has also been shown to induce proliferation and tube formation in human umbilical vein endothelial cells (73). Lastly, the compound FK866 has been reported to inhibit NAMPT and starve tumor cells that rely on nicotinamide-adenine-dinucleotide metabolism for energy. This is a potential therapy for cancer (74,75).
CALCRL, H1F0, EPB41, ANK1, and NMT were the most overexpressed in wound versus tumor. CALCRL belongs to the adrenomedullin receptor complex, which is involved in vasodilation and has been shown to be up-regulated in hypoxia; in conjunction with adrenomedullin, it can enhance endothelial cell survival and migration. CALCRL also increases vessel permeability and can facilitate the migration of white blood cells (76), which suggests a plausible role for CALCRL in wound healing, as white blood cells are delivered to a wound to mount an immune response and provide macrophages that phagocytose necrotic tissue.
Retinoic acid (vitamin A) is also implicated in normal wound healing (77-79). H1F0 is up-regulated in mice in response to retinoic acid (80) and might constitute a possible link to wound healing and granulation tissue formation.
EPB41 has also been shown to have a pivotal role in wound healing in mice. It was demonstrated that EPB41 knockout mice show a phenotype of wound healing impairment (81). In the same study, it was also shown that keratinocytes with silenced EPB41 exhibit reduced adhesion, cell spreading, migration, and motility. In addition, DLG1 has a role in wound healing, and it is proposed that the transglutaminase EPB41 binds to DLG1 (82). Ankyrin 1 could play a role in wound healing by initiating endothelial cell proliferation and/or adhesion via its interaction with CD44v10 and IP3 receptors on endothelial cell lipid rafts (83).
It is interesting that the glycolysis pathway gene BPGM is up-regulated in wound tissue (84). NMT1 is an enzyme known to catalyze the attachment of special lipids onto proteins, which is a pivotal process in cellular growth. This process is also involved in the development of cells from the leukocytic lineage known to be essential for rapid cell proliferation.
Another gene significantly overexpressed in wound relative to tumor is FZD2. This gene is a member of a family that encodes seven-transmembrane domain proteins that are receptors for the wingless type MMTV integration site family of signaling proteins, and it encodes a protein that is coupled to the β-catenin canonical signaling pathway. This pathway plays a role in tissue regeneration and is regulated during these processes (85).
Forty-nine proteins detected in tumor samples were unambiguously of human origin and thus were derived from tumor cells. Many of these proteins were extracellular, matrix, or matrix-associated proteins. The highest peptide coverage (>2) was found for ANXA5, FN1, TGFBI, TNC, ANXA1, A2, type I collagen (COL1A1), and integrin α5. This indicates that our biotinylation procedure in the chicken embryo allows the identification of tumor-derived matrix or matrix-associated proteins when human tumor cells are xenografted into the CAM. Besides matrix proteins, the high percentage of glycolytic enzymes (PCCA; GAPDH; PKM2; aldolase A, fructose-bisphosphate; and LDH) reflects active tumor metabolism and the Warburg effect. Some cytoskeletal proteins and heat shock proteins were also detected. Our in vivo biotinylation method should detect exclusively extracellular and membrane proteins, but this was not the case. It is likely that cell death or injury led to extracellular deposits of intracellular proteins that were then detected via our in vivo biotinylation method.
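The assignment of peptides to a human (tumor cell) or chicken (host) origin, on which the identification of these human proteins rests, can be sketched as a simple set comparison. This is only an illustration under stated assumptions: the peptide strings are invented, exact string matching against two peptide sets stands in for the database search actually performed, and the function name is hypothetical.

```python
def classify_peptide(peptide, human_peptides, chicken_peptides):
    """Label a peptide by the species whose proteins it matches exactly."""
    in_human = peptide in human_peptides
    in_chicken = peptide in chicken_peptides
    if in_human and in_chicken:
        return "shared"      # cannot be attributed to one species
    if in_human:
        return "human"       # unambiguously tumor-cell derived
    if in_chicken:
        return "chicken"     # host derived
    return "unmatched"

# Hypothetical peptide sets standing in for in silico digests.
human_peptides = {"AAAK", "CCCR", "DDDK"}
chicken_peptides = {"CCCR", "EEEK"}
observed = ["AAAK", "CCCR", "EEEK", "FFFR"]

labels = {
    peptide: classify_peptide(peptide, human_peptides, chicken_peptides)
    for peptide in observed
}
human_only = [peptide for peptide in observed if labels[peptide] == "human"]
print(labels, human_only)
```

Only peptides labeled "human" would support a protein of unambiguous tumor-cell origin; shared peptides need the additional differential expression evidence described for the candidate selection.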
Endothelial Cell Signatures-Endothelial cell data filtering of our proteomic data revealed that many well-known angiogenesis regulators or vascular associated proteins were detected in our proteomic analysis. It is interesting that the percentages of proteins with endothelial cell signatures were relatively constant at a total of 20%. This indicates that one-fifth of the proteins identified exhibit an endothelial cell signature. The proteins with the highest endothelial cell enrichment (FC) were SLC12A2, PECAM1, CDH5, CALCRL, ELTD1, CD93, PODXL, TEK, INSR, PTPRB, CD47, KDR, TT3A, endoglin, coxsackie virus and adenovirus receptor, ADAM metallopeptidase domain 17, and NRP1. We also used another bioinformatic data mining approach to identify genes implicated in angiogenesis (AngioScore). The highest AngioScores were observed for many angiogenesis regulators and receptors such as VEGFR2 (KDR), CDH5, TEK, PECAM1, VCAM, NRP1, ANGPT2, TIE-1, ANG, endoglin, ANGPTL1, and THBS1. It is interesting that recently validated angiogenesis regulators were identified here, such as LOXL2 (86). Through the combination of in vivo proteomics and a cDNA library angiogenic screen, potential novel tumor angiogenesis genes were identified, as evidenced by the presence of previously published tumor endothelial markers. In summary, we found that in this high-throughput proteomic study, both EndoFactor and data mining strategies adequately identified many proteins involved in angiogenesis. We emphasize that we detected many of the classical angiogenesis factors and receptors. These molecules are often expressed at lower abundance levels than matrix molecules and are consequently more difficult to detect. This indicates that the biotinylation procedure we used in this study is sensitive enough to detect these proteins. Eighty percent of proteins, however, did not exhibit an endothelial cell signature. This indicates that in vivo biotinylation reveals a majority of proteins that are not strictly endothelial cell associated and are derived from other stroma or tissue compartments.
We performed immunostainings of 10 proteins detected in our study in glioblastoma-implanted CAM and human astrocytoma. For five of these proteins, there was no published evidence of an association with glioma. In particular, HRNR, which is found in regenerating, psoriatic, and healthy human skin and has a role in keratinocyte cornification (87), was detected in glioma cells and in association with tumor vessels. Thus, it might constitute a new glioma and vascular marker.
All in all, we describe in this article for the first time the use of the in vivo biotinylation method combined with high-resolution mass spectrometry and bioinformatic analysis in an embryo to study the vascular and matrix proteome. This technique allows the identification of a significant number of proteins associated with the vasculature, the stroma, and the extracellular matrix. The results also show that a number of classical angiogenesis regulators were identified; these are generally proteins that are found in lower abundance than extracellular matrix molecules. Furthermore, there is evidence of organ/tissue-specific distributions of a number of proteins. Several proteins in this list are known to be classically found in a given tissue, a finding that validates our proteomic analysis. This study also represents the first description of the proteome associated with the vasculature and the stroma in an embryonic organism. The study and the methodology described in our article might constitute fertile ground for further developments. Up-scaling of the protein amounts will certainly allow an increase in the number of proteins detected, especially of those of lower abundance, which might still be of functional importance.