Comparison of Gene Expression Profile in Embryonic Mesencephalon and Neuronal Primary Cultures

In the mammalian central nervous system (CNS) an important contingent of dopaminergic neurons are localized in the substantia nigra and in the ventral tegmental area of the ventral midbrain. They constitute an anatomically and functionally heterogeneous group of cells involved in a variety of regulatory mechanisms, from locomotion to emotional/motivational behavior. Midbrain dopaminergic neuron (mDA) primary cultures represent a useful tool to study molecular mechanisms involved in their development and maintenance. Considerable information has been gathered on the mDA neurons development and maturation in vivo, as well as on the molecular features of mDA primary cultures. Here we investigated in detail the gene expression differences between the tissue of origin and ventral midbrain primary cultures enriched in mDA neurons, using microarray technique. We integrated the results based on different re-annotations of the microarray probes. By using knowledge-based gene network techniques and promoter sequence analysis, we also uncovered mechanisms that might regulate the expression of CNS genes involved in the definition of the identity of specific cell types in the ventral midbrain. We integrate bioinformatics and functional genomics, together with developmental neurobiology. Moreover, we propose guidelines for the computational analysis of microarray gene expression data. Our findings help to clarify some molecular aspects of the development and differentiation of DA neurons within the midbrain. Citation: Greco D, Volpicelli F, Di Lieto A, Leo D, Perrone-Capano C, et al. (2009) Comparison of Gene Expression Profile in Embryonic Mesencephalon and Neuronal Primary Cultures. PLoS ONE 4(3): e4977. doi:10.1371/journal.pone.0004977 Editor: Arto Urtti, University of Helsinki, Finland Received August 2, 2008; Accepted February 26, 2009; Published March 23, 2009 Copyright: 2009 Greco et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: This study has been funded by the Institute of Biotechnology, University of Helsinki (Finland) and by COFIN MIUR/PRIN 2007 (Italy). Dario Greco was funded by the Suomen Kulttuurirahasto (Finland) and by the Ehrnrooth Foundation (Finland). Damiana Leo was recipient of a FEBS summer fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist. * E-mail: dario.greco@helsinki.fi ¤ Current address: Department of Pharmacology, Faculty of Medicine, University of Montreal, Montreal, Canada


Introduction
In the mammalian brain, dopaminergic (DA) neurons are mainly located in the ventral midbrain (mesencephalon, Mes) in which they are arranged in three distinct nuclei: substantia nigra (SN, A10), the ventral tegmental area (VTA, A9) and the retrorubral formation (A8). Neurons originating in the SN project abundantly to the dorsolateral striatum, forming the nigrostriatal pathway. DA neurons of the VTA project mainly to the ventromedial striatum, nucleus accumbens and frontal lobe, forming the mesocorticolimbic pathway. Although midbrain dopamine neurons (mDA) are relatively few (20000-40000 in the rodent), they play an important role in regulating several aspects of basic brain function. Alterations of development or survival, or impairment in DA signalling are involved in a variety of behavioural, movement, and psychiatric disorders. Specifically, nigrostriatal pathway has been implicated in Parkinson disease [1] and Huntington disease [2], as well as in drug abuse toxicity [3]. Instead, alterations in VTA outputs are involved in schizophrenia [4], depression [5], attention deficit hyperactive disorder [6] and addiction [7].
Analyses of mouse mutants defective in mDA development have highlighted several transcription factors contributing the specification of the neurotransmitter identity [8][9][10], as well as neuronal identity and maintenance [11,12]. The molecular environment surrounding the mDA neurons plays an important role for their differentiation. Cooperative signalling by Sonic Hedgehog (SHH) from the floor plate and fibroblast growth factor (FGF) 8 from the isthmus induces mDA [13][14][15][16].
The mDA neuronal primary cultures represent a valuable tool to investigate the molecular mechanisms involved in the development and maintenance of these neurons. The expanded mDA cultures are generated from E11.5 rat ventral Mes, when a large number of mDA precursors are present [17]. Previously, we have demonstrated that the addition of the FGF 2 (also known as basic FGF) from the beginning of the culture in serum-free medium induces neuroblasts proliferation [18,19]. Furthermore mDA differentiation is increased in the culture treated with SHH and FGF8. As in vivo, also in vitro these factors are the inductive signals that specify mDA phenotype [14,19,20]. Markers of differentiated mDA neurons are observed in primary cultures when FGF2, SHH and FGF8 are withdrawn after six days in vitro (DIV) and ascorbic acid is added. Particularly, TH immunostainings show a high number of mDA neurons. In fact at this time in vitro, the number of TH + cells is increased about 20 folds when compared to 3 DIV cultures [19] and comparable to the number of mDA neurons present in primary cultures with or without serum plated at a 20-fold higher cell concentration [21,22]. These expanded cultures also show the presence of both neuroblasts, as shown by nestin-immunoreactivity, as well as more mature neurons. At least 90% of the expanded cells are nestin-positive, and at least 70% are medium neurofilament-positive (NFM). Thus there is a co-localization of nestin and NFM in at least 30% of the cells. Cells expressing glial markers, such as the glial fibrillary acid protein (GFAP), are extremely rare (on average one or two GFAPpositive cells/well at 9 DIV) [19,23]. Glutamatergic and GABAergic neurons are also present, as indicated by the levels of the glutamic acid decarboxylases (GAD65, GAD67 and EP10) and glutamate transporter (EAAT1) mRNAs. Instead serotonergic and noradrenergic neurons are absent, as indicated by lack of 5HT (serotonine transporter, SERT and tryptophan hydroxylase, TrpH) and NE (noradrenaline transporter, NET) markers, respectively [23]. In these cultures, mDA neurons are well differentiated since all molecular mDA markers, such as tyrosine hydroxilase (TH), (Dat), Nr4a2 (also known as Nurr1), and vesicular monoamine transporter 2 (Vmat-2), tyrosine kinase receptor Ret (Ret), GDNF family receptor alpha 1-2 (GFRalpha 1-2) are expressed [17]. Moreover, they also show mature functional signature since high affinity uptake, a marker of mature DA function, is present at 9 DIV. The latter is specific since it is blocked in the presence of selective dopamine uptake inhibitors but not by serotonergic and noradrenergic uptake inhibitors [23]. However, there is a lack of general picture concerning the global gene expression program of these cultures as compared to the tissue of origin.
We have carried out extensive gene expression survey of rat E11.5 mesencephalon (MesE11) and Mes primary culturesPC (MesPC) at 9 DIV generated from it. For this, a microarray experiment using the Affymetrix GeneChips RAE230A has been carried out. Because of the inaccuracies in the original annotation, we have performed the computational analysis by re-annotating all the probes present on the RAE230A chipset according to three major sequence databases, such as Entrez Gene [24], RefSeq [25], and Ensembl gene [26]. The results have been then combined in order to create a robust set of result for further analysis. Finally, by using computational techniques of gene networks and promoter sequence analysis, we have uncovered possible regulatory modules responsible for the expression of multiple genes both in MesE11 and MesPC.

Results
In order to characterize the expanded mesencephalic neuronal primary cultures and their expression profiling, as well as to identify new genes involved in mDA neurons differentiation and maturation, we performed the microarray analysis. To fulfill our aim we used three independent re-annotation systems based on the alignment of each oligonucleotide probe present on the RAE230A chipset against the Entrez Gene (EG), RefSeq (RS), and Ensembl gene (ENS) databases. Based on the new identities for probes we performed three independent analyses abbreviated from now on as EG, RS, and ENS. According to the original annotation released by the manufacturer, the RAE230A chipset contains 15923 probe sets; after re-annotation the probes were re-arranged in 8676 (EG), 13224 (RS), and 7600 (ENS) probe sets. After moderated t-test, we obtained 1016 genes in EG (11.7% of all the screened genes), 1568 (9.8% of all the screened genes) genes in RS, and 862 (11.3% of all the screened genes) genes in ENS to be differentially expressed between MesE11 and MesPC with p-value,0.001 (Table 1). We annotated the gene lists using DAVID which is a gene-centered database where gene entities from several databases are uniquely stored [27]. We took advantage of this system for parsing the gene lists from the three analyses. As many as 987 genes from EG, 1038 genes in RS, and 567 genes in ENS respectively had a reliable DAVID identifier. Of these, 425 unique genes were shared between the three annotations ( Figure 1). The RefSeq-based set showed the largest overlapping with the other two annotation systems. From the set of 425 common genes, 268 genes were upregulated in MesPC, and 157 were upregulated in MesE11 (Table S1). Functional classification of these groups of genes was carried out with the DAVID-based Fisher's exact test. Overall, the three annotations produced very consistent fold change estimation for all of the 425 common genes (average standard deviation = 0.04).
Microarrays are capable to observe changes in the expression of transcripts providing no explanation on how this is modulated within the cells. Gene transcription is also regulated by proteins that recognize short DNA sequence motifs, called transcription factor binding sites (TFBSs). TFBSs are in most cases located in the promoter regions of the genes. Similar TFBSs patterns within the promoters of transcripts are expressed in the same tissue under similar conditions. Thus, the organization of promoter motifs represents a framework of the regulatory mechanisms in a specific biological context. The smallest entities on the level of TFBSs combinations are called promoter modules. These are defined as two or more individual elements that act coordinately and are similarly arranged in the promoters of co-regulated transcripts.
We investigated the interactions of the MesPC genes based on the PubMed co-citation and the presence of TFBS in their promoter sequences. From this analysis, a number of genes In the columns are shown the three re-annotations used for analyzing the dataset (Entrez Gene (EG), RefSeq (RS), and Ensembl gene (ENS)). The absolute number of genes analyzed (a), found significant (b), and with an annotation in DAVID database (c) is shown. In parenthesis, the percent of all the genes analyzed is also shown. doi:10.1371/journal.pone.0004977.t001 emerged to have a specific consensus binding sequence for EGR1 transcription factor in their promoters (PCR validation of the microarray results shown in Figure S1). By multiple alignments, we found a conserved module constituted by a binding site for EGR1 and a second site for SP1 ( Figure 2). Next, we searched for this module in the whole set of known promoter sequences in Rattus norvegicus, locating it in the promoters of 659 genes. Geneontology classification showed significant over-representation of such families as neuron differentiation (34 genes), neurogenesis (37 genes), and neuron development (26 genes), suggesting a central role of EGR1-SP1 module during production and differentiation of neurons (Table S3).

Genes upregulated in mesencephalon at E11
We found 157 genes to be significantly over-expressed in MesE11 as compared to MesPC. These genes covered a variety of cellular and molecular functions, such as developmental process (43 genes), synaptic transmission (17 genes), nervous system development (16 genes) and neurogenesis (9 genes), and ion channels (9 genes). Interestingly, genes involved in oxygen transport (5 genes) and iron binding (11 genes) were also in this group (Table S4). As for the MesPC genes, we have inferred MesE11 gene network based on the PubMed co-citation and the presence of TFBS in their promoter sequences. This revealed a possible role for members of transcription factors of the NEUR and NR2F families in regulating the expression of many genes upregulated in MesE11. Particularly, the genes encoding respectively Neurod3 (NEUR family) and Nr2f2 (NR2F family) were significantly over-expressed in MesE11 ( Figure 3). We also scanned the whole set of known Rattus norvegicus regulatory sequences, finding 726 promoters with the NEUR-NR2F module.

Dopamine-related genes
The Genomatix software Bibliosphere allows to search for genes that are co-cited in the PubMed abstracts with biological themes. By this approach, we have created a catalog of 1339 Rattus Norvegicus genes related in literature to dopamine. Of these, 1032 were present on the re-annotated Affymetrix chipset 230A. A total of 84 genes were found differentially expressed in all the three datasets (Table 2), when comparing MesPC (46 genes) and MesE11 (38 genes). Interestingly, 18 dopamine-related genes overexpressed in MesPC are described as involved in cell differentiation; a subgroup of this, composed of 8 genes, is involved in neuron differentiation. Five dopamine-related MesPC genes are also associated with neurodegenerative diseases. The dopaminerelated genes over-expressed in MesE11 better resampled neurophysiologic events, as they were found associated to synaptic transmission (9 genes), transmission of nerve impulse (9 genes), and behavior (7 genes). Further, we have tested the hypothesis that the transcriptional models inferred from the MesPC and MesE11 gene networks would have a potential role in the regulation of some dopamine-related genes inferred from the literature scan. This seems to be the case, as 283 (,21%) dopamine-related genes presented the EGFR-SP1F module in their promoter regions. This group was enriched in genes involved in neuron differentiation. Similarly, the promoters of 132 (,10%) dopamine-related genes showed the ability to bind the NEUR-NR2F module. This group was enriched in genes involved in neurophysiologic processes, such as transmission of nerve impulse and synaptic transmission.

Discussion
Model organisms are widely used in biomedical research elucidating mechanisms which would be impossible to experiment on using human samples. Mice and rats are often regarded as optimal choices for working on the mammalian CNS. However, Rattus norvegicus genome has been annotated much less in detail, as compared to the Mus musculus genome. In this paper we utilized up-to-date methods for annotating rat arrays with the best possible accuracy. We have used the transcriptional profiling for comparing tissue samples from rat brain to primary cells from the same origin. We studied the effect of re-annotation and at the same time elucidated how well primary cells resemble the original tissues in the level of transcriptional profile. We have extensively investigated the gene expression in rodents' mesencephalon at E11.5 and neuronal primary cultures after 9 DIV, derived from it. For this, we have carried out a microarray experiment using the Affymetrix GeneChips RAE230A for the Rattus norvegicus genome. Because of the design inaccuracies and of the fast speed of updating information concerning the genes and transcripts sequences, many Affymetrix probes are known to have severe design problems as such. Particularly for the chipset RAE230A, several probe sets contain probes with multiple genome hits (13.2%), with no known target (3.6%), with allele-specific probes (19.5%) [28]. Currently, several re-annotation methods are available allowing the probes to be mapped to genes, transcripts, or even exons sequences stored in public databases. However, exon-based re-annotation leads to decreased precision and increased variance in estimating gene expression, probably due to the smaller number of probes that map to each exon [29]. Moreover, due to the fact that the majority of the probes are designed ignoring splicing variances, we find more convenient to work with gene-based rather than transcript-based re-annotations. During the re-annotation process, each single oligonucleotide probe is re-assigned to the correct gene. However, some probes are eliminated, because they don't reliably recognize any transcript, they have been designed for matching the antisense sequence of a given transcript, or they are designed in allele-specific regions. For instance, only the 53.6% (EG), 60.5% (RF), and 44% (ES) of all the RAE230A probes can be utilized for re-annotation. Long lists of differentially expressed candidate genes are usually produced from microarray analysis. However, they cannot be considered as the end point of the analysis but rather as the starting point of a more meaningful interpretation, by taking advantage of the increasing knowledge about the functions of the genes within the cells. The annotation of the genes or transcripts is usually obtained from public libraries such as Gene Ontology [30] or KEGG [31]. Similarly, one can test whether the expression of genes located in specific portions of chromatin (i.e. cytobands or entire chromosome) are involved in certain experimental conditions. For any of the annotations used for grouping the genes, the terms are defined a priori and constructed independently from the experimental data. The DAVID database is one of the most reliable tools for annotating genes and transcripts, as well as for finding overrepresented functional groups of genes in a given gene list. Interestingly, as many as 97% of all the significant EG entities were mapped into DAVID, while only 66.2% and 65.7% respectively from the RS and ENS presented a reliable DAVID annotation. Therefore, we conclude that the re-annotation of the Affymetrix probes according to the Entrez Gene database is the best in terms of gene annotation and functional analysis. Similar results were observed also during the re-annotation of the Affymetrix probes for human genes [32,33]. Investigating gene expression by microarrays presents some limitations especially related to the fact that microarrays can estimate only the levels of the transcripts within the cells, not giving any information concerning the post-transcriptional regulations. In addition, only rigorous statistical methods allow keeping the false discovery rates at reasonable levels, as many technical sources of variations can affect the measurements. Other restrictions to be considered when working with large-scale gene expression studies consist in the inaccuracy of the functional annotation of some genes: for instance, many genes are annotated in the ontology ''apoptosis'', without being directly correlated to this specific process, but simply being involved more generally in cell homeostasis. Careful inspection is always needed for a correct interpretation of long gene lists resulting from microarray assays. Finally, microarrays detect transcripts at very low concentrations, but when working with complex tissues, such as the brain, assigning the expression patterns to a certain cellular subpopulation is impossible. Nevertheless, we believe that the tissues should always be intended as functional entities and their global gene expression should be target of interest.
When examining whole tissues, gene expression from a variety of cell types, including non-neuronal tissues (especially blood), is recorded. This explains why we observed gene groups of haemoglobins and oxygen transport as significantly over-represented in MesE11 over MesPC. Overall, our results suggest that the MesPC are a reliable tool to be used in developmental neurobiology, as their gene expression programs largely resemble the tissue of origin. As these cell cultures comprise a mixed cell population, they are used by the scientific community because they mirror the midbrain neuronal composition better than more homogeneous cell lines [19,23]. Thus we believe that it is important to know what the gene expression profiles are in these cultures. We found genes of the extracellular matrix and of the focal adhesions to be upregulated in MesPC. In the mesencephalon, the adhesion structures are synthesized and maintained by glial cells, that are absent in primary cultures. The protocol used to establish midbrain neuronal cultures enhances dopaminergic differentiation, thus leading to enrichment in positive neurons, when compared to standard cultures [34]. However, some genes of the dopamine biosynthesis were found over-expressed in MesE11. This finding is consistent with the decrease of TH mRNA observed during the in vitro culture progression [19,23,34], and can be due to the decrease of trans-synaptic stimulation following the dissociation of the tissue.
We identified Egr1 and Sp1 as key elements in the regulation of the transcription patterns in MesPC. Particularly, early growth response genes encode for transcription factors that regulate gene expression in response to a variety of stimuli influencing cell growth and differentiation, as well as response to injury and reaction to chronic nervous system diseases [35]. Following depolarization, transcription of the Egr1 gene increases in the MesPC [36], suggesting that this immediate early transcription factor might be a key gene mediating the electrical activity leading to neuronal differentiation. Mice lacking EGR genes present a wide range of developmental abnormalities, including infertility [37], defects of the hindbrain morphogenesis [38], defective  myelination in the peripheral nervous system [39], and defects in learning and memory [40,41]. Gene network and promoter alignment techniques suggested that several key transcripts found in MesE11 can be regulated by the binding of transcription factors of the NEUR and NR2F families. Moreover, the levels of Neurod3 (NEUR) and Nr2f2 (NR2F) were found over-expressed in MesE11. Neurogenic basic helix-loop-helix (bHLH) factors Mash1, neurogenins (Ngns) and NeuroD3 (also known as Ngn1) play important roles in Nurr1-induced mDA neuronal differentiation [42]. While the role of Ngn2 in the development of mDA neurons is well known [43,44], less information is available concerning Neurod3. In addition to inducing neurogenesis by functioning as a transcriptional activator, Neurod3 seems to inhibit the differentiation of neural stem cells into astrocytes [45]. COUP-TFII (Nr2f2) seems to be involved in tangential GABAergic interneurons migration in the developing brain, through the regulation of short-and long-range guidance cues [46]. Functions of Nr2f2 in the mDA phenotype definition are still to be described. In summary, we present a broad view of the transcriptome of the DA neurons in primary cultures and in Mesencephalon E11. By employing gene network techniques, we propose novel models that could explain the transcription regulatory events taking place in the maintenance of the transcriptional identity of the mesencephalon, and in primary cultures derived from this CNS area, with a crucial role in physiology and pathology.  age was determined by considering the day of insemination (as confirmed by vaginal plug) as day E0. Prenatal brains were quickly removed and placed in phosphate buffered saline (PBS) without calcium and magnesium and supplemented with 33 mM glucose. The ventral midbrain was carefully dissected under a stereoscope in sterile conditions and processed for cell cultures (MesPC) or RNA isolation (MesE11). The use of animals was approved by the Institute of Genetics and Biophysics ethical committee and is in agreement with the European Community Directives. All efforts were made to minimize animal suffering and to reduce the number of animals used.

Cell Cultures
Cells were dissociated from the embryonic rat ventral midbrain and cultured as described [19,22,23]. In brief, the tissues were dissected from E11.5 embryos and dissociated using mechanical trituration with a fire polished Pasteur pipette in culture medium (see below) and 0.01% pancreatic deoxyribonuclease (Sigma, Milan, Italy); cells were centrifuged 10 min at 500 g, suspended in Neural Basal Medium (NBM, Invitrogen, Milan, Italy), counted and plated in NBM at a density of 18.000/cm 2 in dishes coated with 15 ug/ml of poly-D-Lysine (Sigma). Multiwell plates (Corning Costar, Milan, Italy) were used for all cultures. NBM was supplemented with B27 (Invitrogen, Milan, Italy), fibroblast growth factor 2 (FGF2, 20 ng/ml, Sigma), the N-terminal fragment of Sonic hedgehog protein (SHH, 50 ng/ml) and fibroblast growth factor 8 (FGF8, 10 ng/ml) for 6 days in vitro. SHH was purified as previously described [23]. Half of the medium was changed every three days. After six days the medium supplements were withdrawn with the exception of B27, and was added the ascorbic acid. Cultures were left for an additional three days.

Microarray strategy and sample preparation RNA isolation
To minimize biological variability four pregnant rats were sacrificed and the E11 embryos were mixed obtaining three groups. Each group was treated separately. The ventral midbrains dissected from each group were pooled (MesE11) or dissociated and the cells were cultured in duplicate. After 9 days in vitro the cultures were collected (MesPC). Three microarrays were hybridized with MesE11 and three with MesPC independent samples.

RNA isolation and RT-PCR
RNA obtained from tissues or from primary cultures was extracted using the Tri-Reagent isolation system (Sigma) according to the manufacturer's instructions. The RNA from the culture duplicates was pooled. The yield and integrity of RNA were determined by a spectrophotometer of A 260 and agarose gel electrophoresis respectively. Total RNA was treated with a DNA free kit (Ambion Inc., Milan, Italy) to eliminate possible DNA contaminations. For microarray hybridation an additional cleanup of total RNA was performed using the RNeasy kit (Qiagen, Milan, Italy). RNA samples were further processed for microarray hybridization or for RT-PCR. RT-PCR analyses were as previously described [47,48]. In brief, two ug of RNA were reverse transcribed using random hexanucleotides as primers (New England Biolabs Inc., Milan, Italy, 6 mM) and 200 U of moloneymurine leukemia virus reverse transcriptase (Ambion). 1/20 of the reverse transcribed cDNA was amplified in a 25 ul reaction mixture containing AmpliTaq Gold DNA polymerase buffer (Applied Biosystem, Milan, Italy), 0.2 mM dNTPs (Finnzymes OY, Espoo, Finland), 0.4 mM each primer, 1.25 U AmpliTaq Gold DNA polymerase (Applied Biosystem) and 1 mCi [ 32 P]dCTP (3000 Ci/mmol, Amersham Biosciences, Milan, Italy). As previously described [48], different sets of primer pairs were used in the same reaction tube to co-amplify cDNA, together with primers for the hypoxanthine-phosphoribosyl-transferase (Hprt), a constantly expressed gene during CNS development, used as an internal standard [49]. After a first denaturing step at 95uC for 8 min, PCR amplification was performed for 28 cycles organized as follows: 95uC for 0.5 min; 56uC-58uC for 0.5 min; 72uC for 0.5 min and was followed by a final extension step (72uC for 5 min). The specificity of PCR primers was determined by performing BLAST searches against the databases. Non-reversetranscribed RNA templates and mock controls were always run in PCR reactions and never gave amplification products. The [ 32 P]labeled amplified products were separated by electrophoresis in 1.5% agarose gel, dried and exposed to a PhosphorImager screen (Amersham). Quantitation was achieved by integrating the volume areas of each fragment obtained from scanning the screens with PhosphorImager apparatus (Amersham), equipped with Image-Quant software. The ratio between the yield of each amplified product and that of the co-amplified HPRT allowed a relative estimate of the mRNA levels [48]. Triplicate samples allowed statistical analysis.

Probe preparation and microarray hybridization
Using the protocol supplied by the manufacturer (Affymetrix, Santa Clara, CA), double-stranded cDNA was synthesized from total RNA and was used to obtain biotin-labeled cRNA by an in vitro transcription reaction (ENZO Diagnostics, Farmingdale, NY). Biotin-labeled cRNA was fragmented and hybridized with Affymetrix RAE230A rat genome GeneChip microarrays, according to the manufacturer's protocol, after verifying the quality of the biotin-labeled cRNA on a Test Chip (Affymetrix).

Affymetrix probes re-annotation
We used a sequence-based re-annotation of the Affymetrix probes on RAE-230A chipset [28]. The R packages used in this study can be retrieved at http://brainarray.mbni.med.umich.edu/ Brainarray/Database/CustomCDF/CDF_download_v10.asp

Data Quality Control
Extensive quality control of the data microarray raw data has been carried out using the methods implemented in the BioConductor packages affy [50] and affyQCReport [51]. All the microarrays showed excellent quality according to the standards, thus all of them were considered for further analysis (data not shown available upon request from the authors).

Data preprocessing
The preprocessing was carried out with the methods implemented in R (http://www.R-project.org) and BioConductor (http://www.bioconductor.org). The CEL files were imported into R environment. After standard quality control (results not shown), the re-annotated data were preprocessed using the RMA algorithm [52]. The RMA allows robust estimation of inter-array variability by employing quantile normalization and by fitting a linear model for each probe set across all the arrays of the dataset.

Feature Selection
The statistical algorithms implemented in the package Limma [53] were employed for selecting the differentially expressed genes. The genes having p-value,0.001 after Benjamini-Hockberg correction [54] were considered as significantly differentially expressed.

Functional analysis
The DAVID gene annotation system was used in order to select over-represented biological terms. Default statistical parameters were employed [27].

Gene network and promoter analysis
For each contrast analyzed, the genes up-regulated in each array group were separately imported into the software Genomatix Bibiosphere to build up gene networks based on their cocitation in the literature as well as the presence of TFBS for known transcription factors in their promoter regions (http://www. genomatix.de/products/BiblioSphere/). The transcription factors (TF) presenting extensive or interesting connectivity within the network were chosen. Consequently, the genes presenting a significant consensus for the selected TFs were further analyzed. The promoter sequences of the genes were retrieved using the software Genomatix Gene2Promoter (http://www.genomatix.de/ online help/help eldorado/Gene2Promoter Intro.html) and analyzed with Genomatix FrameWorker (http://www.genomatix.de/ online help/help gems/FrameWorker.html) to search for common models containing at least two TFBS. Finally, the significant models were screened for the whole set of known Rattus norvegicus promoters by using Genomatix ModelInspector (http://www. genomatix.de/online help/help fastm/modelinspector help.html).

Statistical analysis
The analyses applied to the microarray data have been described above. For all other experiments, analysis of variance was carried out, followed by post hoc comparison (ANOVA, Scheffè F-test). Data were expressed as mean+/2SEM. Figure S1 Summary of the PCR validations of the microarray results:The Nr4a2, Th, and Egr1 gene expression have been tested by PCR and statistically validated as described in materials and methods. Found at: doi:10.1371/journal.pone.0004977.s001 (0.04 MB TIF)

Supporting Information
Table S1 Genes significant in the three analyses with a DAVID annotation. Fields: DAVIDID: DAVID unique ID; EG: Entrez Gene ID; logFC_EG: logarithmic fold change (MesPC-MesE11) based on the EG re-annotation; P.Value_EG: p-value based on the EG re-annotation; adj.P.Val_EG: adjusted (Benjamini-Hochberg) p-value based on the EG re-annotation; ENSID: Ensembl gene ID; logFC_ENS: logarithmic fold change (MesPC-MesE11) based on the ENS re-annotation; P.Value_ENS: p-value based on the ENS re-annotation; adj.P.Val_ENS: adjusted (Benjamini-Hochberg) p-value based on the ENS re-annotation; RSID: RefSeq ID; logFC_RS: logarithmic fold change (MesPC-MesE11) based on the RS re-annotation; P.Value_RS: p-value based on the RS re-annotation; adj.P.Val_RS: adjusted (Benjamini-Hochberg) p-value based on the RS re-annotation; GeneName: official gene name. Found at: doi:10.1371/journal.pone.0004977.s002 (0.15 MB XLS) Table S2 Functional analysis of the genes significantly overexpressed in MesPC. Each stack represents a group of related functional terms. Each term within each stack is in a row of the table. The number of genes annotated, the percent of representation of the term, and the enrichment p-value are shown.