Discovery of Novel Cell Surface Markers for Purification of Embryonic Dopamine Progenitors for Transplantation in Parkinson's Disease Animal Models

Despite the progress in safety and efficacy of cell replacement therapy with pluripotent stem cells (PSCs), the presence of residual undifferentiated stem cells or proliferating neural progenitor cells with rostral identity remains a major challenge. Here we report the generation of a LIM homeobox transcription factor 1 alpha (LMX1A) knock-in GFP reporter human embryonic stem cell (hESC) line that marks the early dopaminergic progenitors during neural differentiation to find reliable membrane protein markers for isolation of midbrain dopaminergic neurons. Purified GFP positive cells in vitro exhibited expression of mRNA and proteins that characterized and matched the midbrain dopaminergic identity. Further quantitative proteomics analysis of enriched LMX1A+ cells identified several membrane-associated proteins including a polysialylated embryonic form of neural cell adhesion molecule (PSA-NCAM) and contactin 2 (CNTN2), enabling prospective isolation of LMX1A+ progenitor cells. Transplantation of human-PSC-derived purified CNTN2+ progenitors enhanced dopamine release from transplanted cells in the host brain and alleviated Parkinson's disease-related phenotypes in animal models. This study establishes an efficient approach for purification of large numbers of human-PSC-derived dopaminergic progenitors for therapeutic applications.

Midbrain dopaminergic neurons are a subset of the neuromodulatory neurons that reside in the basal ganglia, substantia nigra pars compacta, and ventral tegmentum areas and project axons to the dorsal striatum, prefrontal cortex, and ventral striatum regions, respectively. Loss of the nigrostriatal pathway is a pathological outcome of Parkinson's disease (PD); other pathological hallmarks include aggregation of alpha-synuclein (SNCA) and formation of Lewy bodies in affected neurons. To restore the degenerated neurons, fetal mesencephalic cells have been used in clinical trials (1,2). These studies suggest that transplanted cells can survive in the brains of patients with PD and ameliorate the disease symptoms (3,4). Pluripotent stem cells (PSCs) are another source of human-derived cells for cell replacement therapy in neurodegenerative diseases. There has been considerable progress in increasing the survival and functional effects of transplanted dopaminergic neurons derived from PSCs (5-7). Transplantation of neural progenitor cells differentiated from human PSCs into the brain of an animal model of PD has resulted in long-term survival and correct target innervation comparable to fetal midbrain tissue (8). Despite progress in the safety and efficacy of cell therapy with PSCs, major concerns remain unaddressed, including the presence of residual undifferentiated stem cells or proliferating neural progenitor cells with rostral identity, which may cause overgrowth and tumor formation (8-12). Contamination with serotonergic neurons could lead to graft-induced dyskinesia following transplantation of cells into the host brain (13-16). Various protocols, and different batches of differentiated cells generated from the same protocol, have shown biochemical differences, which lead to variability in transplantation outcomes. Therefore, standardized protocols to isolate and purify cells for transplantation are essential for consistent and reliable results.
Here we report the generation of a LIM homeobox transcription factor 1 alpha (LMX1A) knock-in GFP reporter human embryonic stem cell (hESC) line that marks the early dopaminergic progenitors during neural differentiation. This reporter cell line was differentiated toward dopaminergic neurons, followed by enrichment of GFP-positive cells in vitro. Purified cells exhibited expression of mRNA and proteins that characterized and matched the midbrain dopaminergic molecular signature. Differentiation of purified cells yielded a homogeneous population of dopaminergic neurons that expressed marker genes and had functional properties. Comparative shotgun proteomics analysis revealed a specific protein expression pattern in dopaminergic progenitors and mature neurons. A combination of Western blotting and immunostaining approaches confirmed the results. Enrichment of membrane proteins in the purified cell population was a criterion to monitor dopaminergic progenitor cell purification.
The pKO-DTA backbone (a gift from the Eccles Institute of Neuroscience, John Curtin School of Medical Research, Australian National University) was used to construct the targeting vector. Right and left homology arms were amplified from Royan H6 genomic DNA as a template with the Expand High Fidelity PCR kit (Roche Diagnostics, Basel, Switzerland). We removed the ATG start codon from the end of the left arm and added one nucleotide to the start of the right arm. The right homology arm replaced the KpnI and XhoI sites in the vector. Then, eGFP was inserted into the NotI and SacII sites, and we cloned the left arm in the NotI site. Genomic PCR products and constructed clones were sequenced to detect possible errors from the PCR amplification process. We selected clones with no mismatch against the NCBI human reference genome GRCh37.p13 Primary Assembly for electroporation. The final vector (12,451 bp) comprised a 3,388-bp 5′ homology arm, eGFP, a loxP-flanked Neo resistance cassette, and a 2,824-bp 3′ homology arm, and was introduced into the hESCs.
Genetic modification and electroporation were performed as previously reported (18). Briefly, feeder-free hESCs were dissociated into single cells with Accutase for 10 min at 37°C. A total of 10 × 10⁶ cells per 700 µl of cold PBS were incubated with 20 µg of linearized plasmid in the BlcI site for 5 min on ice. Cells were electroporated in 0.4-cm cuvettes (Gene Pulser Xcell modular electroporation system, Bio-Rad) with 250-V and 500-µF constant parameters, after which cells were harvested by centrifugation and replated on mitomycinated MTK-Neo media (Australian National University). G418 (Sigma) selection was started after three days and continued for a period of 10 days (range: 50-200 µg/ml). To excise the neomycin cassette, we electroporated the cells with CRE recombinase plasmid transiently as described above. Targeted clones were identified by long PCR amplification from outside of the targeted region, and we kept the modified colonies for karyotype and differentiation analysis. We used the qPCR and reference controls as previously described (19) for copy number quantification in order to eliminate any random integration events in the established clones.
Immunofluorescence Staining-For immunostaining, cells were fixed with 4% paraformaldehyde (Sigma-Aldrich, P6148) for 20 min, after which their membranes were permeabilized with 0.25% Triton X-100 (Sigma-Aldrich, T8532) and blocked with 10% host serum of the secondary antibody and 1% BSA (Sigma-Aldrich, A3311). Cells were incubated overnight at 4°C with the primary antibodies (Table S1) diluted in blocking solution. After three washes, cells were incubated with secondary antibodies (Table S1). Nuclei were counterstained with DAPI (Sigma-Aldrich, D8417) and analyzed with a fluorescence microscope (Olympus, IX71) and a Nikon A1 confocal laser microscope system.
Real-time PCR-Total RNA was isolated using TRIzol™ reagent (Invitrogen) from three independent biological replicates of cultured cells. cDNA was synthesized from 1 µg of DNase I (Takara, 2270A)-treated RNA using a cDNA synthesis kit (Fermentas, KI632) according to the manufacturer's instructions. Real-time PCR was carried out on a Rotor Gene 6000 (Corbett Research) in 10-µl reactions that contained 5 µl of SYBR® Premix Ex Taq™ (Takara, RR041) and 0.375 µM of each primer, with each sample run in duplicate as technical replicates. All primers used in these assays were tested for specificity and amplification efficiency (Table S2).
Protein Extraction and Separation by SDS-PAGE-Three replicates from hESC and differentiated samples (at least 5 × 10⁶ cells/sample) were used in the shotgun proteomics. Samples were washed twice with 5 ml ice-cold PBS. The samples were then centrifuged at 450 × g for 5 min at 4°C. Next, we added 1 ml of the lysis buffer (Qiagen) that contained 1 unit of benzonase nuclease and 10 µl of protease inhibitor (100x) to the cell precipitates, which were subsequently incubated at 4°C. Cells were disrupted by subjecting them to sonication (three times) on ice, each for 5 min (45-s pulses with 15-s intervals). The insoluble debris was then pelleted by centrifugation at 14,000 × g at 4°C for 10 min. The protein concentration in the supernatant was quantified by the Bradford Assay Kit (Bio-Rad, Hercules, CA, USA) using BSA as the standard. We treated the extracted proteins with sodium dodecyl sulfate (SDS) sample buffer (160 µg per well). Proteins were separated on 12% bis-tris polyacrylamide gels at 100 V for 1 h and visualized using colloidal Coomassie blue. Finally, the gels were washed twice in water (10 min per wash), and the individual lanes were then cut into 12 slices of equal size from top to bottom.
In-gel Digestion by Trypsin-Each stained gel lane was cut into 16 pieces. Each piece was diced into smaller pieces, which were then placed into individual wells of a 96-well plate. For destaining, the gel pieces were first briefly washed with 100 mM NH₄HCO₃, followed by washing twice with 200 µl of 50% acetonitrile (ACN)/50% 100 mM NH₄HCO₃ for 20 min each time. The pieces were dehydrated with 100% ACN, air dried, and reduced with 50 µl of 10 mM DTT/NH₄HCO₃ (50 mM) at 37°C for 1 h. The samples were then alkylated in the dark with 50 µl of 50 mM iodoacetamide/50 mM NH₄HCO₃ at room temperature for 1 h. Next, samples were briefly washed with 100 mM NH₄HCO₃ and 200 µl of 50% ACN/50% 100 mM NH₄HCO₃ for 10 min, dehydrated with 100% ACN, and air dried. Finally, samples were digested with 20 µl of trypsin (12.5 ng/µl of trypsin in 50 mM NH₄HCO₃) on ice for 30 min and then left overnight at 37°C. Peptide extractions were carried out three times with 30 µl of 50% ACN/2% formic acid. Extracted peptides were dried in a vacuum centrifuge and reconstituted to 10 µl with 2% formic acid.
Nanoflow Liquid Chromatography-Tandem Mass Spectrometry-The resultant peptides from the SDS-PAGE gel slices were analyzed by nanoflow liquid chromatography tandem mass spectrometry (nanoLC-MS/MS) using an LTQ-XL ion-trap mass spectrometer (Thermo, Fremont, CA, USA). Reversed-phase columns were packed in-house to ~9 cm (100 µm inner diameter) using 100 Å, 5 µm Zorbax C18 resin (Agilent Technologies, Santa Clara, CA, USA) in a fused silica capillary with an integrated electrospray tip. A 1.8-kV electrospray voltage was applied via a liquid junction upstream of the C18 column. Samples were then injected into the C18 column using a Surveyor autosampler (Thermo, Fremont, CA, USA). The column was washed with buffer A, which consisted of 5% (v/v) ACN and 0.1% (v/v) formic acid, for 10 min at 1 µl/min before each loading. Peptide elution was carried out with 0-50% buffer B, which consisted of 95% (v/v) ACN and 0.1% (v/v) formic acid, over 58 min at 500 nl min⁻¹, followed by 50-95% buffer B over 5 min at 500 nl min⁻¹. The eluted peptides were then directed into the nanospray ionization source of the mass spectrometer. Spectra were scanned over the range of 400-1,500 amu. Automated peak recognition, dynamic exclusion with a window set to 90 s, and tandem MS of the top six most intense precursor ions at 35% normalized collision energy were performed using Xcalibur software (version 2.06; Thermo, Fremont, CA, USA).
Protein Identification-Raw files were converted to the mzXML format using the ReAdW program and processed through Global Proteome Machine software with version 2.1.1 of the X!Tandem algorithm (available in the public domain at: http://www.thegpm.org). For each experiment, 16 fractions were processed sequentially with output files for each individual fraction, and we generated a merged, nonredundant output file for protein identification with log (e) values less than −1. Tandem mass spectra were searched against a Homo sapiens protein database compiled from NCBI (Refseq protein database containing 99,871 sequences as of September 2014) with search parameters that included MS and MS/MS tolerances of ±2 Da or ±0.2 Da and K/R-P cleavages and allowed for up to two missed tryptic cleavages. The database contained a list of common human tryptic peptide contaminants. Fixed modifications were set for carbamidomethylation of cysteine and variable modifications were set for methionine oxidation.
Functional Annotation-The gene identifier (GI) numbers of proteins were converted to Symbols and Entrez accession IDs using the bioDBnet biological database network (20). The lists of up- and downregulated proteins with relative GI ID for each comparison were then uploaded in DAVID (http://david.abcc.ncifcrf.gov/).
Western Blot Analysis-Proteins (10 µg total) from hESCs and from LMX1A-positive and -negative progenitors were electrophoresed in 12% SDS-PAGE (120 V for 1 h) using a Mini-PROTEAN 3 electrophoresis unit (Bio-Rad). The proteins were transferred to a PVDF membrane (Amersham Biosciences) by semi-dry blotting (Bio-Rad) using Dunn carbonate transfer buffer (10 mM NaHCO₃, 3 mM Na₂CO₃, 20% methanol). Membranes were blocked for 1.5 h using western blocker solution (Sigma, W0138) and incubated overnight at 4°C with primary antibodies (Table S8). Membranes were incubated with the peroxidase-conjugated secondary antibodies, anti-rabbit IgG (1:100,000, Sigma, A2074) as appropriate, for 1 h at room temperature. Finally, the blots were visualized using ECL detection reagent (Sigma, CPS-1-120) and images were quantified using ImageJ software (NIH, USA).
Cell Transplantation and Behavior Analysis-All animal procedures received the approval of the Royan Institutional Review Board and Institutional Ethical Committee (approval ID: J/90/1397). For the in vivo transplantation studies, we used adult female Sprague-Dawley rats (200-250 g). Animals were obtained from Razi Institute (Karaj, Iran) and maintained in temperature-controlled rooms with a 12/12 h light/dark cycle, 50-55% humidity, and ad libitum access to food and water. The nigrostriatal dopamine pathway was partially lesioned in the host rats by an injection of 4 µl 6-hydroxydopamine (3.5 µg/µl free base dissolved in a solution of 0.2 mg/ml L-ascorbic acid in 0.9% w/v NaCl) into the medial forebrain bundle at the following coordinates with reference to the bregma and dura: anterior/posterior (AP): −4.4 mm; medial/lateral (ML): −1.2 mm; dorsal/ventral (DV): −7.8 mm; and target base (TB): −2.4. After three weeks, we assessed the effect of the lesion on motor function using tests that examine side bias. For the apomorphine-induced rotation test, apomorphine was injected subcutaneously (s.c.) at a dose of 0.25 mg/kg, and contralateral turns were monitored for a period of 40 min using an automated rotameter. For spontaneous motor tests, we examined forelimb use during explorative activity; briefly, rats were placed individually in a glass cylinder, and the numbers of independent wall placements observed for the right forelimb, the left forelimb, and both forelimbs simultaneously were recorded for 10-min periods. Lesioned animals were stratified across four groups (n = 8 per group) according to values obtained for both behavioral measures. Only rats that exhibited a mean net rotation of at least six full turns/min for apomorphine over 60 min and ≤25% use of the forelimb contralateral to the lesion in the cylinder test were included in the study.
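The inclusion rule stated above (at least six net turns/min and ≤25% contralateral forelimb use) can be illustrated with a short sketch. This is not the authors' scoring software: the function names are hypothetical, net rotation is assumed to mean contralateral minus ipsilateral turns, and the cylinder-test percentage is assumed to be contralateral placements divided by all recorded placements, since the text does not give the exact formula.

```python
def net_turns_per_min(contralateral_turns, ipsilateral_turns, minutes):
    """Mean net rotation rate (assumption: net = contralateral - ipsilateral)."""
    return (contralateral_turns - ipsilateral_turns) / minutes


def contralateral_use_percent(contra, ipsi, both):
    """Share of wall placements made with the contralateral forelimb alone.

    Assumption: the denominator is every recorded placement
    (contralateral + ipsilateral + both forelimbs).
    """
    total = contra + ipsi + both
    if total == 0:
        raise ValueError("no wall placements recorded")
    return 100.0 * contra / total


def include_rat(turns_per_min, contra_percent, min_turns=6.0, max_contra=25.0):
    """Apply the two thresholds quoted in the text."""
    return turns_per_min >= min_turns and contra_percent <= max_contra


# Hypothetical animal: 420 net contralateral turns in 60 min, 4/12/4 placements.
rate = net_turns_per_min(420, 0, 60)
use = contralateral_use_percent(4, 12, 4)
eligible = include_rat(rate, use)
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive group composition is to the cutoffs.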
One week after the behavioral pretesting (four weeks after lesioning), each animal received a total of 2 µl of cell suspension for CNTN2+ cells and 4 µl for NCAM+ cells, with the same cell numbers for unsorted cells, NCAM+ and CNTN2+ cells, and fibroblast cells. We adjusted the cell numbers to 70,000-80,000 cells/µl. Two separate deposits of 1 µl each were placed along each of two implantation tracts in the head of the striatum at the following coordinates with reference to the bregma and dura: AP: +1.2/+0.5 mm; ML: −2.6/−3 mm; DV: −4.5 mm; and TB: −2.4 mm. The capillary was left in place for 5 min before withdrawal. The animals survived for 10-12 weeks, during which we analyzed their motor performance every two weeks. Immunosuppressive treatment in the form of daily intraperitoneal injections of cyclosporine A (Novartis; 10 mg/kg) was administered throughout the experiment, beginning the day prior to transplantation.
Experimental Design and Statistical Rationale-For statistical comparisons, we used three independent differentiations from Royan H6 (P25-35) and H9 to dopaminergic progenitors (n = 3), dorsal forebrain (n = 3), spinal progenitors (n = 3), and dopaminergic mature neurons (n = 3). For qPCR comparisons, we ran two technical replicates along with our biological replicates to minimize the variation in each run. Relative mRNA levels were calculated using the comparative CT method (Delta Delta CT method) with glyceraldehyde 3-phosphate dehydrogenase (GAPDH) used as an internal control for normalization in REST© (Relative Expression Software Tool). Cell quantification for immunostaining images was performed using ImageJ and the cell counter plugin for over 1,000 cells.
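As an illustration of the comparative CT calculation named above, the following minimal sketch computes a fold change from CT values normalized to a reference gene such as GAPDH. It assumes perfect doubling per cycle (100% amplification efficiency) and is not a reimplementation of REST; the function names and CT values are hypothetical.

```python
def mean(values):
    """Average technical replicates before normalization."""
    return sum(values) / len(values)


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Fold change by the comparative CT (Delta Delta CT) method.

    Assumes the amplicon doubles every cycle (efficiency of 2).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical duplicate CT readings for a target gene and GAPDH.
fold = fold_change_ddct(
    ct_target_sample=mean([21.5, 22.5]),   # target, sorted positive cells
    ct_ref_sample=mean([18.0, 18.0]),      # GAPDH, sorted positive cells
    ct_target_control=mean([27.0, 27.0]),  # target, negative cells
    ct_ref_control=mean([18.0, 18.0]),     # GAPDH, negative cells
)
```

With these made-up inputs the Delta Delta CT is −5, so the fold change is 2⁵ = 32; a difference of six cycles would correspond to a 64-fold change.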
The p value and the Benjamini-Hochberg false discovery rate (FDR) were used to determine the significance of enrichment or overrepresentation of terms for each annotation [e.g. Gene Ontology biological process and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway]. These proteins were also selected and reloaded into Qiagen Ingenuity® Pathway Analysis (IPA®, Qiagen, CA, USA, www.qiagen.com/ingenuity) for further analysis using gene symbols as identifiers, along with fold changes and adjusted p values as observations, with a cutoff of 35 molecules per network and 25 networks per analysis. For Gene Ontology categories of interest, normalized spectral abundance factor (NSAF) data were summed to obtain the overall protein abundance change over time for biological process categories. Then, Gene Ontology annotation and relative protein abundance were plotted side by side for the up- and downregulated proteins for each of the comparison tests.
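For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, a minimal sketch of the standard step-up procedure follows. It is a generic illustration, not the DAVID or IPA implementation, and the function name and example p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment p values for four annotation terms.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A term is then called enriched when its adjusted value falls below the chosen FDR threshold.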
Quantitative MS analysis was performed on three biological replicates of hESCs, LMX1A-positive and -negative progenitor cells, and their corresponding differentiated neurons. For each condition, we combined the Global Proteome Machine outputs from each of the three biological replicates to produce a single nonredundant, high-confidence shotgun proteomic analysis output. The final list contained only the proteins that were identified in all three biological replicates of at least one condition, with a minimum summed spectral count of five across the three replicates. FDRs of identification were calculated using a reversed sequence decoy database: Protein FDR = (#reverse proteins identified)/(total protein identifications) × 100 and Peptide FDR = 2 × (#reverse peptide identifications)/(total peptide identifications) × 100 (21,22). We used NSAF as previously described by Zybailov et al. (2007) to estimate protein abundance data (23). For each protein k in sample i, the number of spectral counts (SpCk) that identified the protein was divided by the estimated protein length (Lengthk). The protein length was determined by dividing the molecular weight of the protein by the average amino acid molecular weight. SpCk/Lengthk values were normalized to the total by dividing by the sum of SpCk/Lengthk over all proteins, which yielded NSAF values for each sample i. When plotting or summarizing the overall protein abundance for a particular condition, we used the average of the NSAF values for all replicates as a measure of protein abundance. A spectral fraction of 0.5 was initially added to all spectral counts to compensate for null values and allow for log transformation of the NSAF data prior to statistical analysis (24).
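The NSAF and decoy-based FDR calculations described in this paragraph can be sketched as follows. This is an illustrative sketch rather than the pipeline used in the study: the function names are hypothetical, and the average amino acid mass of 110 Da is an assumed value, because the text does not state which figure was used.

```python
AVERAGE_AA_MASS_DA = 110.0  # assumption: the source does not give this value


def estimated_length(molecular_weight_da):
    """Estimate protein length as molecular weight / average residue mass."""
    return molecular_weight_da / AVERAGE_AA_MASS_DA


def nsaf(spectral_counts, molecular_weights, pseudocount=0.5):
    """Normalized spectral abundance factor for every protein in one sample.

    spectral_counts and molecular_weights are dicts keyed by protein id.
    The pseudocount of 0.5 avoids zeros before log transformation.
    """
    ratios = {
        protein: (count + pseudocount) / estimated_length(molecular_weights[protein])
        for protein, count in spectral_counts.items()
    }
    total = sum(ratios.values())
    return {protein: ratio / total for protein, ratio in ratios.items()}


def protein_fdr(reverse_proteins, total_proteins):
    """Protein FDR (%) from a reversed-sequence decoy search."""
    return reverse_proteins / total_proteins * 100


def peptide_fdr(reverse_peptides, total_peptides):
    """Peptide FDR (%); the factor 2 follows the formula in the text."""
    return 2 * reverse_peptides / total_peptides * 100


# Hypothetical two-protein sample: counts and molecular weights in Da.
values = nsaf({"A": 10, "B": 0}, {"A": 55000.0, "B": 11000.0})
```

Because each sample is normalized to its own total, the NSAF values of a sample sum to one, which is what makes replicates with different total spectral counts comparable.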
Student's t-tests on log-transformed NSAF data were used to determine differentially accumulated proteins for each comparison test. Proteins with t-test p values <0.05 were considered to be significantly changed between the two conditions. Additionally, an analysis of variance (ANOVA) was performed to identify proteins that changed in abundance among those that were present and reproducible under all conditions. The analysis was performed on log-transformed NSAF data, and proteins with an ANOVA p value <0.05 were considered to show a significant change between the different experimental conditions.
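A minimal sketch of the two-sample comparison on log-transformed NSAF values is given below. It uses only the Python standard library, so instead of computing an exact p value it compares the pooled-variance t statistic with the two-sided 5% critical value for 4 degrees of freedom (2.776), which applies only to three replicates per group. The function names and NSAF values are hypothetical.

```python
import math
from statistics import mean, variance

T_CRITICAL_DF4 = 2.776  # two-sided 5% critical value, Student's t, 4 d.f.


def student_t(a, b):
    """Pooled-variance (Student's) two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def compare_conditions(nsaf_a, nsaf_b):
    """Return (abundance ratio, t statistic, significant at p < 0.05)."""
    if len(nsaf_a) != 3 or len(nsaf_b) != 3:
        raise ValueError("critical value is only valid for 3 vs 3 replicates")
    log_a = [math.log(x) for x in nsaf_a]  # log transform before testing
    log_b = [math.log(x) for x in nsaf_b]
    t = student_t(log_a, log_b)
    ratio = mean(nsaf_a) / mean(nsaf_b)
    return ratio, t, abs(t) > T_CRITICAL_DF4


# Hypothetical NSAF values for one protein in two conditions.
ratio, t_stat, significant = compare_conditions(
    [0.004, 0.005, 0.0045], [0.001, 0.0012, 0.0011]
)
```

In practice a statistics package would return the exact p value; the critical-value comparison is used here only to keep the sketch dependency-free.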

Characterization and in Vitro Differentiation of LMX1A-GFP Reporter Human Embryonic Stem Cells (hESCs)-
We generated an LMX1A-GFP knock-in cell line (Fig. 1B) by using electroporation to introduce linearized pKO-LA-GFP-RA-DTA constructs (Table S2, Fig. S1) into hESCs. Cells were subjected to antibiotic treatment with G418, and the resultant antibiotic-resistant hESC clones were screened by PCR genotyping with primer pairs specific to the desired targeted allele (Fig. S1). DNA sequencing further revealed that the flanking sequences of the homology arm regions were identical to the predicted targeted allele in the two knock-in clones. qPCR-based assay results further indicated that only one copy of the GFP gene existed in each of the knock-in cell lines (Fig. S1B), which reflected the absence of randomly integrated copies of the donor vector.
LMX1A-GFP clones (Fig. 1B) were differentiated into dopaminergic neurons with the floor-plate-based midbrain DA neuron protocol (Fig. 1A). They showed reporter expression five days postdifferentiation (Fig. 1C) and maintained this expression throughout the differentiation time until day 12 (Fig. 1D). Gene expression analysis of mRNA levels revealed a higher expression of genes associated with DA progenitor cells 8-12 days after induction, with the maximum expression of the reporter gene at day 8 (Fig. 1E). Cells isolated by FACS (Fig. S2A) were analyzed by qRT-PCR to confirm enrichment for expression of DA-specific transcripts compared with the negative control population. Our results demonstrated 32-fold enrichment for LMX1B and 64-fold enrichment for TH expression in the GFP reporter cells compared with the negative cells (Fig. 1F). Immunostaining analysis of FACS-sorted cells confirmed that a large proportion of the cells that expressed GFP also co-expressed FOXA2 (91% ± 3.9), LMX1A (84% ± 4.7), OTX2 (93% ± 5.3), and CORIN (63% ± 6.7) proteins 24 h after sorting (Fig. 2A and 2C, Fig. S2C). Further differentiation of the GFP-positive cells resulted in significant elongation of the axons and improved arborization (Fig. 2B). We observed that TH expression co-localized with GRIK2 (82% ± 8.7), PITX3 (82% ± 10), and MAP2 immunoreactivity (89% ± 5.4; Fig. 2D, Fig. S2). Together, these in vitro data supported our hypothesis that the knock-in approach could be effectively used to purify a progenitor population of midbrain DA neurons.
Proteome Signature Analysis of LMX1A-GFP hESCs and Differentiated Cells Enabled Identification of DA-neuron-enriched Proteins-We performed shotgun proteomics analysis of FACS-purified cells for DA progenitors and mature neurons at days 12 and 30, respectively, and for the negative population for both progenitors and mature neurons (n = 3 independent differentiations for each step) (Fig. 3A). Supplemental Table S3 lists all raw data for the identified proteins in each sample, protein and peptide FDR, and all identified proteins. Overall, across all analyzed samples, we reproducibly identified 1,572 proteins with protein and peptide FDRs below a 1% threshold (Table S4). The data were considered highly stringent and required no further filtering. The expression of 906 proteins showed significant alterations among all samples; these proteins were plotted in the heat map (Fig. 3B, Table S5). We assessed the dopaminergic progenitor and mature neuron-specific proteome profiles by comparing reporter-positive cells against hESCs and negative control cells. The protein fold changes were calculated and plotted in scatter graphs for the progenitor cells and neurons (Fig. 3C). Our results showed that 280 identified proteins were overrepresented in the LMX1A-positive progenitors (expression ratio ≥ 2 and p ≤ 0.05) compared with the negative control cells. Of these, 141 proteins were expressed mainly in the GFP-positive progenitors relative to the stem cells, whereas 139 proteins were highly expressed in positive progenitors but might also have some expression in the hESCs (Fig. 3D; also see Fig. S3). Additionally, 124 proteins were exclusively expressed in the positive progenitors relative to the hESCs and might have expression in negative cells (Table S6). In the differentiated LMX1A-positive neurons, 132 proteins had increased expression compared with hESC samples, whereas 58 proteins had increased expression compared with the LMX1A-negative neurons (Fig. 3E, Table S6). Upregulated proteins identified in the LMX1A-positive progenitors are mainly involved in malate metabolism, apical protein localization, asymmetric protein localization, vesicle targeting to, from, or within the Golgi, and axonal cargo transport. In addition, these proteins have been implicated in biochemical pathways associated with neural differentiation processes, such as the TCA cycle and acetyl-CoA metabolism (Fig. 3E, Table S7). The biological processes that were mainly overrepresented in the LMX1A-positive neurons are primarily associated with intracellular signaling cascades involved in Rac signaling, intracellular biochemical homeostasis, actin filament and microtubule formation, regulation of neurotransmitter release, and adult behavior (Fig. 3D and 3E).
Validation of Differentially Expressed Proteins-To corroborate the results obtained by the MS analysis, we subjected selected proteins to immunostaining and Western blot analysis. We sought to verify the abundance and localization of upregulated proteins and their co-expression with the LMX1A protein as a hallmark of progenitor cells. Immunostaining results for RBM14 and TAF15 (nuclear proteins), VIM (cytoplasmic protein), and CNTN2 (membrane-associated protein) confirmed that all assessed proteins were co-expressed in LMX1A-positive cells in the unsorted population of progenitors (Fig. 4A and 4B). Western blotting analyses of lysates from hESCs, LMX1A-positive cells, and negative control cells revealed changes that matched the results obtained from the MS analysis (Figs. 4C and 4D, Fig. S6). Specifically, we observed two isoforms of STAT3 protein in the hESCs and negative control cells, while the positive cells expressed only the longer alpha isoform. Similarly, LMX1A-positive cells demonstrated much greater expression of RBM14, TAF15, VIM, GSK3β, APMAP, GAP43, PITX3, and CNTN2 proteins compared with the LMX1A-negative cells (Fig. 4D). Although we observed detectable expression of these proteins in the negative cells and hESCs, GAP43, GSK3β, and VIM were robustly expressed in the positive progenitor cells. We also investigated the localization of some of the identified proteins for possible cell surface association that could potentially serve as a suitable marker for DA progenitor cells (Fig. 5A). Immunofluorescence staining followed by microscopic examination confirmed cell surface localization for CNTN2, FLOT2, CALB2 (calretinin), and APMAP proteins in the progenitor cells (Fig. 5B).

Figure legend (fragment, Western blot panels): ... (Fig. S6) are presented. One replicate of each group from a similar blot is presented for newly identified proteins in the human embryonic stem cells (hESCs) and in LMX1A+ and LMX1A− progenitors at day 12 after differentiation, and (D) quantification of band intensity for each protein in three replicates of samples, normalized to GAPDH protein expression, with the mean expression of each group compared with the expression of proteins in the hESCs. PITX3 protein was used as a positive control (n = 3 independent experiments; mean ± S.D.; NS: not significant, *p < 0.05, **p < 0.01, ***p < 0.001; Dunnett's test).

Isolation of DA Progenitors Based on the Cell Surface Proteins and Transplantation into the PD Animal Model-To further examine the specificity of our differentiation method and its validation by the proteomics approach, we differentiated the H9 hESC line into rostral and caudal neural progenitors (see the methods) and examined the expression of CNTN2, one of the proteins that we found to be more highly expressed in DA progenitors. We found that both forebrain progenitors expressing LHX2 and PAX6 and caudal spinal neural progenitors expressing OLIG2 and HB9 had little expression of CNTN2, whereas the midbrain progenitors developed by our protocol had higher and specific expression, as determined by double staining with FOXA2 and OTX2 (Fig. 5C). We decided to isolate the progenitor cells based on their membrane protein expression. We tested flotillin, calretinin, NCAM, and APMAP antibodies and three different antibodies for CNTN2. Most commercial antibodies available for these proteins did not have enough specificity for the FACS system and failed to generate reliable results (we defined consistency here as a coefficient of variation of less than 25%), except for one CNTN2 antibody (Abcam, ab133498) and the NCAM antibody (Figs. S4 and S5). Antibodies against these proteins showed acceptable consistency for purifying progenitors from the mixed culture of progenitors.
Immunostaining for pluripotency markers OCT3/4 and NANOG in day 12 progenitors sorted with CNTN2 and unsorted showed that these cells retained OCT3/4 expression at a lower level, but both populations were negative for NANOG expression, indicating elimination of residual undifferentiated embryonic stem cells during the differentiation process (Fig. S2).
Temporal analysis of CD24 (cell marker for early neural progenitors) and CD56 (NCAM) protein expression with FACS indicated that, over the differentiation time, the population of CD24+ cells decreased from 84% at day 9 to 18% at day 21, and the population of NCAM-positive cells increased gradually from 14% at day 9 to 84% at day 21 (Fig. S5A), consistent with the proteomics data. CD56-positive cells were isolated from the culture at day 21 of differentiation and transplanted into the striatum of adult rats pretreated unilaterally with 6-OHDA. Animals receiving CD56+ sorted cells (n = 15) were compared with intact control animals (n = 15) and lesioned animals without cell administration (n = 10). All animals received daily cyclosporine treatment to suppress the immune response and minimize tissue rejection until the rats were euthanized at the end of week 12. Animals grafted with CD56+ cells showed significant recovery in rotation behavior 6-12 weeks after implantation, and only this group showed a significant improvement in contralateral forelimb use over the lesion control animals at the 8-12 week time points (ANOVA with Tukey post hoc test; p ≤ 0.01; Fig. S5B).
Figure legend (fragment): (B) Immunohistochemistry analysis of 6-OH dopamine semi-lesioned rat brain at four weeks after lesion for TH to mark DA neuron terminals in the striatum. (C) Normalized apomorphine-induced rotation at 10 weeks after transplantation.
We also surgically transplanted CNTN2-positive cells isolated from the culture after 12 days of differentiation into the striatum of adult rats pretreated unilaterally with 6-OHDA (Fig. 6A and 6B). Separate groups of animals received one of three cell types, namely CNTN2-sorted cells (n = 15), total unsorted cells (n = 15), or control fibroblast cells (n = 10), or vehicle without cells (n = 10). All animals received daily cyclosporine treatment to suppress the immune response and minimize tissue rejection until the rats were euthanized at the end of week 12.
Animals grafted with either sorted or unsorted progenitors showed significant recovery in the rotational behavior test at 10 weeks after cell transplantation, with reversal of apomorphine-induced rotational asymmetry in 6-OHDA-lesioned rats, consistent with dopamine release from the grafts (ANOVA with Tukey post hoc test; p ≤ 0.01; Fig. 6C). However, we observed faster recovery, by the end of eight weeks, in the animals that received the sorted cells (Fig. 6C). We observed significant improvement in contralateral forelimb akinesia over the lesion control for CNTN2-sorted progenitors, and this improvement was significantly greater than that seen with the unsorted population (Fig. 6D). Histological analysis of brain sections revealed that all grafted animals, except those implanted with fibroblast cells, had transplants that survived and stained positive for human-specific proteins (Fig. 6F). None of the grafts showed signs of tumor formation or cell overgrowth. Transplanted progenitors from both the sorted and unsorted populations differentiated to a significant extent into TH- and DAT-positive neurons. A significantly greater number of CNTN2-positive progenitors expressed TH and DAT proteins compared with the unsorted cells (Fig. 6E and 6G).
DISCUSSION
Transgenic human PSCs, as a potential therapeutic platform, are being increasingly explored in order to obtain an enriched population of differentiated cells in vitro (25-27). Among various transgenesis methods, the knock-in approach to introducing reporter genes in cell lines holds the promise of precise gene expression with minimal positional effects or disruption of other genes caused by random integration. We used the locus of LMX1A, a transcription factor crucial for the development of mesodiencephalic dopaminergic neurons (28), to mark the early stage of DA progenitor development and to characterize proteins that are exclusively expressed in these cells.
Overexpression of LMX1A in human pluripotent cells increased their differentiation to DA neurons (29), and we previously reported that addition of TAT-permeable LMX1A recombinant protein to the differentiation culture increased the number of hESC-derived DA neurons (7). The transgenic heterozygous cells prepared in this study grew normally, had a normal hESC karyotype, and successfully differentiated to DA neurons. Because expression of the LMX1A gene was reduced, we detected reporter expression using immunostaining approaches during the initial days of differentiation (starting from day 3); expression reached its maximum by day 8 after induction. These observations were consistent with previous studies that elucidated LMX1A expression during embryonic development and human PSC differentiation (30-32). The cells that were positive for GFP expression mainly expressed the floor plate proteins FOXA2 and CORIN and the dopaminergic-specific protein PITX3, which was reminiscent of authentic midbrain dopaminergic neuron progenitors.
Although robust methods have been introduced that produce enough modified cells, uncertainty remains about selecting the right cell types for transplantation from human pluripotent cells, given their heterogeneity (33). Only a limited number of studies have focused on selecting DA neurons for transplantation. One strategy facilitates isolation of a homogenous population of predefined cell types by sorting target cells based on their membrane proteins or glycans; however, this kind of method requires an in-depth understanding of the glycobiology or membrane proteins to be used as markers. Unfortunately, this information is not available for DA progenitors. Corin (atrial natriuretic peptide-converting enzyme) has previously been used as a brain floor plate marker to mediate sorting of DA progenitors in a mixed culture of neuronal progenitors (34-36). The results suggested that only 40% of sorted cells expressed TH and Nurr1 as midbrain DA markers (35).
[Fig. 6 legend, continued] in vehicle and animals transplanted with CNTN2+ cells, unsorted cells, and fibroblast cells; rats in the vehicle and fibroblast graft groups showed no reduction in drug-induced rotations at any of the analyzed time points. (D) CNTN2+ grafted animals displayed significantly more paw touches than lesion control rats and animals grafted with unsorted cells when tested in the cylinder at four weeks after transplantation and afterward. The animals that received unsorted cells showed no significant difference from the control animals. (E) Quantification of the numbers of TH+ and DAT+ cells in the transplanted CNTN2+ sorted versus unsorted cells in the grafted region of PD animals. (F) Immunohistochemistry analysis of the grafted region with human-specific nuclei (hNu) and human-specific cytoplasmic marker (stem 121) and (G) immunohistochemistry for human-specific TUJ1, TH, and DAT antibodies for engrafted human cells in the rat brain. Both CNTN2+ sorted cells and unsorted cells were positive for TH and DAT immunostaining. All data are presented as mean ± S.E. *p < 0.01 (CNTN2+ sorted cells versus unsorted cells), **p < 0.001 (CNTN2+ sorted cells versus unsorted cells and lesion control), ***p < 0.0001 (CNTN2+ sorted cells versus unsorted cells and lesion control).
In a more recent study, researchers screened 312 annotated antibodies for those that specifically enriched the FOXA2-positive population and found that integrin-associated protein marks FOXA2-positive cells (38). Our pure culture of TH neurons (>80% TH and GIRK2 double-stained neurons), coupled with a shotgun proteomics approach (39), enabled us to identify proteins with higher abundance in DA progenitors and neurons. In this study, we found several novel transcription factors co-expressed in LMX1A-positive cells, such as RBM14 and TAF15, along with TFs previously reported in the dopaminergic areas of the brain, such as XPO1 (40), FOXK1 (41), and PHF6 (42). The positive cells demonstrated higher expression of TARDBP, a nuclear protein in which various mutations have been identified in frontotemporal dementia, amyotrophic lateral sclerosis, Alzheimer's disease, and PD (Table S7) (43). We examined the proteins that were exclusively localized to the plasma membrane and identified contactin 2, calbindin 2, flotillin 2, adipocyte plasma membrane associated protein, and neural cell adhesion molecule 1, and we confirmed their expression in our progenitors (Figs. 4 and 5, Table S8). Ganat et al. used transgenic reporter systems for the HES5, NURR1, and PITX3 genes to purify dopaminergic progenitors and compared transcriptome data obtained from cells enriched with these three reporters (25). They identified CHRNA6, CHRNB3, Gucy2C, and Rit2 as highly expressed integral membrane transcripts (25), although GUCY2C and CHRN proteins are predominantly expressed in adult DA neurons and are not exclusively representative of progenitors (44). We did not find any of the previously described membrane-associated transcripts in our proteomics study. This discrepancy could partly be attributed to the technical limitation of detecting proteins with low expression.
However, differences in the developmental stages of the cells likely played a major role in the altered expression of various proteins compared with the former studies. We found that PSA-NCAM was enriched in our proteomics data, that isolation of progenitors based on this membrane protein enriched for dopaminergic markers, and that transplantation of NCAM-positive progenitors could relieve symptoms in PD animals (Fig. S5B and S5C). Although we observed a clear reduction in rotation and an increase in forelimb use after NCAM-positive cell transplantation, we cannot claim that these cells are all dopaminergic; they could be any type of neuron, as reported by others, which suggests that membrane-associated proteins such as PSA-NCAM and ALCAM are not specific to mDA progenitors and are expressed in other neuronal progenitors (36,37). Therefore, we mainly focused on CNTN2 in this study as a more specific protein marker for DA progenitors. We further tested isolated CNTN2+ progenitors to evaluate their functionality in vivo and to validate our approach to finding reliable membrane proteins for purifying authentic DA progenitors. The significant improvement in contralateral touches in animals transplanted with CNTN2+ cells and the consistent decline of drug-induced rotation in model animals are both indications of dopamine release from transplanted cells in the host brain. Although we noted no significant improvement in motor performance for the unsorted cells group, these cells could be as beneficial as sorted progenitors in terms of reducing rotation in the animals and dopamine production (just like NCAM-positive cells). The different performance of the sorted and unsorted cells in the two PD animal model assays could be due to the fact that motor impairment tests differ in sensitivity: drug-induced rotation is more sensitive to TH cell loss in the substantia nigra (SN) of animals than the spontaneous motor test of forelimb asymmetry in the cylinder test (45).
This indicates that apomorphine-induced rotation more sensitively reflects the number of integrated TH neurons on the lesion side, whereas more integrated TH neurons are needed before an effect appears in the forelimb asymmetry (cylinder) test. In our study, both groups of animals that received transplants, with sorted or unsorted cells, had enough TH-positive neurons, which was probably sufficient for recovery and a decreasing number of rotations over time (as this test is more sensitive). However, in the forelimb asymmetry cylinder test, we observed recovery only in the group with more TH cells (the CNTN2-sorted group) and a small improvement in the other group. Therefore, we conclude that, even though we transplanted the same number of cells in both groups, the CNTN2-positive cells contained more dopaminergic progenitors, which was reflected as an improvement in the forelimb asymmetry test. Future studies in alternative animal models, such as nigrostriatal bundle lesion models, could be used to validate the beneficial effects of unsorted cells on behavioral outcomes (46). Our novel finding of significantly more TH/DAT double-positive neurons in isolated CNTN2+ progenitors compared with unsorted cells further supports the notion that the purity of transplanted cells might be a more critical parameter for recovery of motor abilities than the number of transplanted cells.