A Large Number of Protein Expression Changes Occur Early in Life and Precede Phenotype Onset in a Mouse Model for Huntington Disease

Huntington disease (HD) is fatal in humans within 15–20 years of symptomatic disease. Although late stage HD has been studied extensively, protein expression changes that occur at the early stages of disease and during disease progression have not been reported. In this study, we used a large two-dimensional gel/mass spectrometry-based proteomics approach to investigate HD-induced protein expression alterations and their kinetics at very early stages and during the course of disease. The murine HD model R6/2 was investigated at 2, 4, 6, 8, and 12 weeks of age, corresponding to absence of disease and early, intermediate, and late stage HD. Unexpectedly the most HD stage-specific protein changes (71–100%) as well as a drastic alteration (almost 6% of the proteome) in protein expression occurred already as early as 2 weeks of age. Early changes included mainly the up-regulation of proteins involved in glycolysis/gluconeogenesis and the down-regulation of the actin cytoskeleton. This suggests a period of highly variable protein expression that precedes the onset of HD phenotypes. Although an up-regulation of glycolysis/gluconeogenesis-related protein alterations remained dominant during HD progression, late stage alterations at 12 weeks showed an up-regulation of proteins involved in proteasomal function. The early changes in HD coincide with a peak in protein alteration during normal mouse development at 2 weeks of age that may be responsible for these massive changes. Protein and mRNA data sets showed a large overlap on the level of affected pathways but not single proteins/mRNAs. Our observations suggest that HD is characterized by a highly dynamic disease pathology not represented by linear protein concentration alterations over the course of disease.

ber of proteins available (13,14). In addition, unbiased, large scale approaches have identified a large number of genes potentially involved in HD because microarrays probe a very large number of expression alterations simultaneously at the mRNA level (15,16). Two-dimensional gel electrophoresis (2-DE)- (17,18) or liquid chromatography-based proteomics approaches in combination with mass spectrometry are able to investigate many protein alterations at the same time (19,20) increasing the number of proteins known to be involved in HD and complementing the protein interaction data.
HD has already been studied extensively at the transcriptomic level using brain tissue from a large number of mouse models (21–25) and humans (26) as well as human blood cells (27), mostly at late stages with a visible disease phenotype. Proteome data on expression changes in HD are scarcer and so far only available for the R6/2 model (28–31). A common denominator of these studies is that only stages with disease phenotype were investigated (28–31). To understand the disease pathology in more detail, early changes need to be investigated. Here we present protein expression changes that occur at 2, 4, 6, 8, and 12 weeks of age in the well established R6/2 mouse model of HD, representing the transition from the absence of disease-related phenotypes to a pronounced symptomatic disease state. Our results show that a large number of protein alterations are present prior to the development of disease phenotypes. In addition, most protein alterations were found to be disease stage-specific, and no protein was altered at every stage investigated.
Protein Extraction Procedure-Total protein extracts were prepared from entire brains. The extraction procedure has been published previously and validated (30,35). Frozen tissue, 1.6 parts (v/w) buffer P (50 mM Trizma® (Tris base) (Sigma-Aldrich), 50 mM KCl, and 20% (w/v) glycerol at pH 7.5) supplemented with a final CHAPS concentration of 4% (w/v) in the sample, 0.08 parts protease inhibitor solution I (one Complete™ tablet (Roche Applied Science) dissolved in 2 ml of buffer 1), and 0.02 parts protease inhibitor solution II (1.4 μM pepstatin A and 1 mM phenylmethylsulfonyl fluoride in ethanol) were ground to a fine powder in a mortar precooled in liquid nitrogen. The tissue powder was transferred into a 2-ml tube (Eppendorf, Hamburg, Germany), quickly thawed, and supplied with glass beads (0.034 units of glass beads/combined weight of tissue, buffers, and inhibitors in mg; glass beads, 2.5 ± 0.05-mm diameter; Worf Glaskugeln GmbH, Mainz, Germany). Each sample was sonicated six times in an ice-cold water bath for 10 s each with cooling intervals of 1 min 50 s in between. The homogenate was stirred for 30 min in buffer P without CHAPS at 4°C in the presence of 0.025 parts (v/w) Benzonase (Merck) and a final concentration of 5 mM magnesium chloride in the sample. Subsequently 6.5 M urea and 2 M thiourea were added, and stirring was continued for 30 min at room temperature until urea and thiourea were completely dissolved. The protein extract was supplied with 70 mM dithiothreitol (Bio-Rad) and 2% (v/w) ampholyte mixture (Servalyte pH 2-4, Serva, Heidelberg, Germany), corrected by the amount of urea added (correction factor = sample weight prior to addition of urea/sample weight after addition of urea), and stored at −80°C. Protein concentrations were determined in sample aliquots without urea using the Bio-Rad DC Protein Assay according to the protocol supplied by the manufacturer.
2-DE-After genotyping, mice were allocated to the HD or control group. Sample pairs were randomly selected by choosing one mouse brain from each group. 2-D gels were run in batches of two. Sample pairs consisting of an HD and a control sample were run in parallel in both dimensions of 2-DE (IEF and SDS-PAGE). Different sample pairs were processed on different days to avoid confounding of the experiment by processing date. Protein samples were separated by the large gel 2-DE technique developed in our laboratory as described previously (17,18). The gel format was 40 cm (isoelectric focusing) × 30 cm (SDS-PAGE) × 0.75 mm (gel width). For IEF using the carrier ampholyte technique, we applied 6 μl (20 μg/μl) of protein extract of each sample to the anodic end of an IEF gel (40 cm) and used a carrier ampholyte mixture to establish a pH gradient from 3 to 10. For SDS-PAGE the IEF gels were cut in half and run as acidic and basic sides. Proteins were visualized in SDS-PAGE polyacrylamide gels by high sensitivity silver staining (18,36). 2-D gels were dried (described extensively in Ref. 36) and scanned at 300 dpi and 16-bit gray scale using a scanner (Microtek Scan Maker 9800XL, Evestar GmbH, Willich, Germany). The 2-D gel images were subsequently saved in TIFF format to avoid loss of quality due to compression.
Quantitative Analysis of Protein Expression-After uploading the 2-D gel images, protein spot patterns were evaluated with Delta2D imaging software version 3.5 (DECODON, Greifswald, Germany) as described in detail elsewhere (37). Delta2D is our standard 2-D gel evaluation software and has been validated in many of our studies (30,35,38–41). Briefly, 2-D spot patterns of HD and control mouse brains were matched using the Delta2D "exact" mode matching protocol. First, sample pairs (HD and control) were matched individually. Subsequently all HD gels were matched to create a "match link" between all 2-D spot patterns using match vectors (37). Using "union" mode, a fusion image was generated that included the visible spots of each 2-D gel for each time point (2, 4, 6, 8, and 12 weeks), creating 10 fusion gels (five time points, one fusion gel for the acidic side and one for the basic side of 2-D gels). Only the fusion image was used for spot detection using the following settings for Delta2D: local background region, 100; average spot size, 1; and sensitivity, 100%. Spots were not edited manually after spot detection. About 2000 spots on the fusion image were transferred to all other 2-D gel images for each time point. This ensures that, for each stage investigated, the identity of each spot is the same on every gel. Relative spot volume intensities (fractions of 100%) were used for quantitative protein expression analysis. After background subtraction, normalized spot intensity values were copied into Excel spreadsheets for statistical analysis. Data sets were analyzed by applying a Student's t test. For HD versus control comparisons, a paired t test was used to compare sample pairs run side by side during both electrophoresis runs. In longitudinal studies, where different age stages of HD or control were compared with one another, we used an unpaired t test because no natural pairing exists.
Pairs were randomly selected from each group (time points) compared. The 2-D gel evaluation procedure by Delta2D remained the same as in HD versus control comparisons.
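The difference between the two tests can be illustrated with a minimal sketch. It assumes the textbook forms of the paired t statistic and of Student's unpaired t statistic with pooled variance; the spot volumes are hypothetical values, not data from this study, and the function names are illustrative only.

```python
import math
from statistics import mean, stdev

def paired_t(hd, ctrl):
    """t statistic for paired samples (HD and control gels run side by side).

    Degrees of freedom: n - 1, where n is the number of pairs.
    """
    diffs = [h - c for h, c in zip(hd, ctrl)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

def unpaired_t(a, b):
    """Student's t statistic with pooled variance (equal variances assumed).

    Degrees of freedom: len(a) + len(b) - 2.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical relative spot volumes (% of total gel volume) for one spot
# measured in five HD/control gel pairs.
hd = [0.052, 0.048, 0.055, 0.050, 0.053]
ctrl = [0.045, 0.043, 0.049, 0.044, 0.047]

print("paired t:", paired_t(hd, ctrl))      # compare with t(df=4)
print("unpaired t:", unpaired_t(hd, ctrl))  # compare with t(df=8)
```

With five pairs, the two-tailed critical values at p = 0.05 are 2.776 for the paired test (4 degrees of freedom) and 2.306 for the unpaired test (8 degrees of freedom). Pairing removes gel-to-gel variation shared by both members of a pair, which is why it is preferred when samples are run side by side.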
To determine changes in total protein concentration, we determined the protein amount (gray value determined by the Delta2D 2-D gel evaluation software) changed for all up- or down-regulated proteins at each stage. The sum of the protein amounts of, for example, all significantly up-regulated proteins in controls was subtracted from the corresponding sum in HD, which gave the actual amount of changed protein. To determine the total protein amount changed, we added the amounts for up- and down-regulation. To determine whether the total amount of protein changed was significantly different between stages, we used an unpaired t test. The total amounts changed for each 2-D gel pair (five HD versus control repeats) were calculated separately. We subsequently compared the total amounts of each stage with those of its adjacent stages by t test. All protein amounts are relative because they are expressed as alterations compared with control; although changes in gray scale value (spot volume) are proportional to protein concentration changes, the absolute concentration values are not available. Sample size comprised at least five biological sample pairs. The Student's t test was used because the data investigated were normally distributed. The Kolmogorov-Smirnov Z test provided by the statistical analysis software SPSS 16.0 (SPSS Inc., Chicago, IL) was used to test our data for normal distribution. SPSS calculates a "two-tailed significance level," testing the probability that the observed distribution is significantly deviant from the expected normal distribution. That is, a finding of non-significance means that the sample distribution does not deviate detectably from a normal distribution. We used the data set of statistically significant protein isoforms (p < 0.05, paired Student's t test) at 8 weeks to check for normal distribution.
This data (sub)set was chosen because (i) testing all data points (4816 spots) was very difficult using the program available (SPSS) and (ii) the significant changes were the relevant ones for this study. We investigated the up- and down-regulated spots separately. The two-tailed significance level for up- and down-regulated protein isoforms was on average 0.81 ± 0.21 and 0.86 ± 0.17, respectively. That is, the null hypothesis (normal distribution) could not be rejected (p > 0.05), and the data were therefore treated as normally distributed. This held for all protein isoform changes tested.
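The calculation of the amount of protein changed for one gel pair can be sketched as follows. This is a minimal illustration, assuming that the inputs are the relative spot volumes (percent of total gel volume) of the significantly altered spots only; the function name and the example values are hypothetical.

```python
def amount_changed(hd_volumes, ctrl_volumes):
    """Return (up, down, total) changed spot volume for one HD/control gel pair.

    hd_volumes and ctrl_volumes hold the relative volumes (% of total gel
    volume) of the same significantly altered spots, in the same order.
    Summing per-spot differences is equivalent to subtracting the control
    sum from the HD sum within the up- and down-regulated groups.
    """
    up = 0.0
    down = 0.0
    for hd, ctrl in zip(hd_volumes, ctrl_volumes):
        diff = hd - ctrl
        if diff > 0:
            up += diff        # more protein in HD than in control
        else:
            down += -diff     # less protein in HD than in control
    return up, down, up + down

# Hypothetical relative volumes of four significantly altered spots.
hd_pair = [0.50, 0.20, 0.35, 0.10]
ctrl_pair = [0.30, 0.40, 0.25, 0.15]
up, down, total = amount_changed(hd_pair, ctrl_pair)
print(f"up {up:.2f}%, down {down:.2f}%, total {total:.2f}%")
```

Computing this value separately for each of the five gel pairs of a stage yields the per-stage samples that are then compared between adjacent stages by t test.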
The rate of false positives was estimated according to the following equation for unpaired t tests.
We assumed that the null hypothesis that there is no difference between samples tested is valid. In addition, x1 and x2 are the sample means of the distributions, and s1 and s2 are the corresponding standard deviations. n1 and n2 indicate the number of sample pairs used in our study. Because we assumed an equal standard normal distribution of both data sets to be tested for false positives, we assume that the standard deviations are equal, s1 = s2. Z denotes the desired confidence and was determined to be 2.015 for Q(0.95) (the 0.95 quantile) and n = n1 = n2 = 5 according to the "quantiles of Student's t distribution." The false positive rate was therefore determined to be smaller than 15% of the significant changes for an unpaired t test. This value is even smaller for paired t tests.
Evaluation of False Positive Protein Changes-To determine the reliability of the results obtained by our HD time course investigation, it was important to establish a base line of protein alterations that occur even if no disease is present. Therefore we selected a representative time point, 8 weeks, and compared eight control gels with each other. We randomly allocated the eight 2-D gels into two groups of four each. Those groups were analyzed in the same way as HD and control gels were. We analyzed a total of 4031 protein spots by an unpaired Student's t test (p < 0.05) and found that 17 spots were up-regulated and 22 were down-regulated in expression. This makes a total of 39 spots changed at 8 weeks without disease present. When comparing this result with the changes obtained during our time course study, where 205, 42, 40, 157, and 240 spots were altered between HD and control at 2, 4, 6, 8, and 12 weeks, respectively (Table I), we found a percentage of false positives of 19, 93, 98, 25, and 16%, respectively. In addition, a direct comparison of changes at 8 weeks (157 spots) shows that 25% of the proteins may be false positives, or at least 118 changes were identified correctly.
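The percentages above follow directly from the spot counts given in the text, as the short check below shows. The only assumption carried over from the text is that the 8-week control-versus-control base line of 39 spots applies to every stage; the function names are illustrative.

```python
# Spot counts taken from the text above.
BASELINE_CHANGED = 17 + 22  # up + down in the control-versus-control comparison at 8 weeks
ALTERED_HD_VS_CONTROL = {2: 205, 4: 42, 6: 40, 8: 157, 12: 240}  # weeks -> altered spots

def false_positive_percent(baseline, altered):
    """Share of altered spots that could be explained by the base line."""
    return 100.0 * baseline / altered

def min_correct_changes(baseline, altered):
    """Lower bound on the number of correctly identified changes."""
    return max(altered - baseline, 0)

for week, n_altered in ALTERED_HD_VS_CONTROL.items():
    pct = false_positive_percent(BASELINE_CHANGED, n_altered)
    print(f"{week:>2} weeks: {pct:.1f}% possible false positives, "
          f"at least {min_correct_changes(BASELINE_CHANGED, n_altered)} correct")
```

The high percentages at 4 and 6 weeks reflect the small number of altered spots at those stages rather than a larger base line.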
Protein Identification-For protein identification by mass spectrometry, 40 μl of extract was separated by 2-DE and stained using an MS-compatible silver staining protocol (42). Protein spots of interest were excised from 2-D gels and subjected to in-gel tryptic digestion. Tryptic fragments were analyzed by nanoflow HPLC (Dionex/LC Packings, Amsterdam, Netherlands)/ESI-MS and -MS/MS on an LCQ Deca XP ion trap instrument (Thermo Finnigan, Waltham, MA). Nanoflow HPLC was directly coupled to ESI-MS analysis. Protein spot eluates of 15 μl were loaded onto a PepMap100 C18 precolumn (5 μm, 100 Å, 300-μm inner diameter × 5 mm; Dionex/LC Packings) using 0.1% (v/v) trifluoroacetic acid at a flow rate of 20 μl/min. Peptides were separated on a PepMap100 C18 100 column (3 μm, 100 Å, 75-μm inner diameter × 15 cm; Dionex/LC Packings). The elution gradient was created by mixing 0.1% (v/v) formic acid in water (solvent A) and 0.1% (v/v) formic acid in acetonitrile (solvent B) and run at a flow rate of 200 nl/min. The gradient was started at 5% (v/v) solvent B and increased linearly up to 50% (v/v) solvent B after 40 min. ESI-MS data acquisition was performed throughout the LC run. Three scan events, (i) full scan, (ii) zoom scan of the most intense ion in the full scan, and (iii) MS/MS scan of the most intense ion in the full scan, were applied sequentially. No MS/MS scan on singly charged ions was performed. Raw data were extracted by the TurboSEQUEST algorithm, and trypsin autolytic fragments and known keratin peptides were subsequently filtered. All DTA files (peak list files for mass spectrometry results generated by the SEQUEST search algorithm) generated by BioWorks version 3.2 (Thermo Scientific, Waltham, MA) were merged and converted to MASCOT generic format files (MGF). Mass spectra were analyzed using our in-house MASCOT software package license version 2.1, automatically searching the NCBInr database for Mus musculus (house mouse) (NCBInr_20061206, 107,853 sequences). The M. musculus subset of the NCBInr database was used because only mouse samples were investigated. In rare cases, hits were searched again using the Mammalia subset of the NCBInr database. All non-M. musculus proteins are indicated in supplemental Table 1 by addition of either "Homo sapiens" or "Rattus norvegicus" after their protein name. Because of space constraints, the species label was omitted for the large majority of M. musculus identifications to reduce the length of the protein names. MS/MS ion search was performed with the following set of parameters: (i) taxonomy, M. musculus (house mouse); (ii) proteolytic enzyme, trypsin; (iii) maximum of accepted missed cleavages, 1; (iv) mass value, monoisotopic; (v) peptide mass tolerance, 0.8 Da; (vi) fragment mass tolerance, 0.8 Da; and (vii) variable modifications, oxidation of methionine and acrylamide adducts (propionamide) on cysteine. No fixed modifications were considered. Only proteins with scores corresponding to p < 0.05 with at least two independent peptides identified were considered. The cutoff score for individual peptides using ESI identification was equivalent to p < 0.05 for each peptide and usually in a MOWSE (molecular weight search) score range from 32 to 37. This number was calculated by the MASCOT software. Furthermore, theoretical and practical molecular weight and pI for each protein identified by database search were compared to remove proteins with deviating masses and pI values.
Pathway Enrichment Analysis in the Protein Data Set-Official gene symbols and gene names (Mouse Genome Informatics) were used to investigate similarities in protein expression alterations between stages. The gene names were retrieved using the GI numbers supplied by MASCOT after a database search. To investigate an enrichment of specific pathways in the altered protein expression data set, we used the "Web-based gene set analysis toolkit" (WEBGESTALT) tool supplied by Vanderbilt University. We used the "Gene set analysis tool" and selected the gene set analysis option "Function" and the category "KEGG table and maps." KEGG is the abbreviated form of Kyoto Encyclopedia of Genes and Genomes and is a bioinformatics database containing information on genes, proteins, reactions, and pathways. The following parameters were used to create the KEGG tables: reference set, "WEBGESTALT_MOUSE"; significance level, p < 0.01; and minimum number of genes, 2. Statistical methods available were "hypergeometric test" and "Fisher's exact test." For our data the results were the same with either test.
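The principle of the hypergeometric enrichment test can be sketched in a few lines. This is a generic textbook formulation, not the WEBGESTALT implementation; the function name and all counts in the example (reference set size, pathway size, number of altered genes) are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """Probability of observing at least k pathway genes among n altered genes.

    N: genes in the reference set
    K: reference genes annotated to the pathway
    n: altered genes submitted for analysis
    k: altered genes annotated to the pathway
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Hypothetical example: 20,000 reference genes, 60 of them in the pathway,
# 150 altered genes, 5 of which belong to the pathway.
p = hypergeom_enrichment_p(k=5, n=150, K=60, N=20000)
print(f"p = {p:.2e}; enriched at p < 0.01: {p < 0.01}")
```

For a one-sided test of over-representation, this upper-tail hypergeometric probability is the same quantity as the one-sided Fisher's exact test on the corresponding 2 × 2 table, which is consistent with both methods giving the same results.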
Analysis of mRNA Data Sets for Co-regulation with Protein Expression Data Sets-We used an mRNA data set for the R6/2 HD mouse model for 6, 9, and 12 weeks that had been published previously (22,43). Only the striatal data set was utilized. The data sets were already analyzed for statistical significance in their respective studies (22,43). Briefly, Affymetrix microarrays were normalized using robust multiarray averaging. Analysis was performed using R version 2.3 and the Bioconductor packages Affy and Limma. Differential gene expression in each array set was assessed relative to unaffected controls using paired t tests. Random matching generated six HD-control sample pairs (43). The mRNA data set consists of 22,626 probe set identities. From the entire mRNA data set only genes with differential regulation at the protein level were selected for analysis. On the single mRNA/protein level we found that 88% of 371 altered proteins could be correlated to corresponding mRNA data. A common gene symbol (mouse variant) of protein and mRNA data was used as the selection criterion. We thus had mRNA expression data for 328 of our proteins at 6, 9, and 12 weeks for HD and control at our disposal. We then determined whether any significant changes occurred in any of the 328 mRNAs at each of the three stages. The significance level for altered mRNA expression selected for our study was p < 0.05. If probe sets were not altered significantly in more than 66% of cases (note that more than one probe set per gene name is present on the mRNA chip) they were discarded. In the three data sets at 6, 9, and 12 weeks of age, one (Rbmx), three (Eef1a2, Mapre2, and Nono), and zero genes showed opposite regulatory behavior, respectively. Opposite regulatory behavior was present when individual probe sets coding for the same gene name on the mRNA array showed opposite expression behavior. The number of statistically significant mRNA alterations of the total of 328 was determined for each stage.
The gene names for altered proteins were compared with the gene names of mRNAs to determine the overlap in expression changes between protein and mRNA on the single mRNA/protein level.
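Matching the two data sets by a common gene symbol amounts to a set intersection, sketched below. The function name is illustrative, and the membership of the example symbols in the mRNA list is hypothetical; only the counts 328 and 371 are taken from the text.

```python
def match_by_symbol(protein_symbols, mrna_symbols):
    """Return the shared gene symbols and the percentage of proteins covered."""
    proteins = set(protein_symbols)
    shared = sorted(proteins & set(mrna_symbols))
    coverage = 100.0 * len(shared) / len(proteins) if proteins else 0.0
    return shared, coverage

# Hypothetical example lists of gene symbols.
altered_proteins = ["Gapdh", "Uchl1", "Stmn1", "Tppp"]
probe_set_genes = ["Gapdh", "Uchl1", "Stmn1", "Rbmx", "Nono"]
shared, coverage = match_by_symbol(altered_proteins, probe_set_genes)
print(shared, f"{coverage:.0f}% of altered proteins have mRNA data")

# Coverage reported in the text: 328 of 371 altered proteins.
print(f"{100 * 328 / 371:.0f}%")
```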
To investigate the overlap in cellular pathways between the mRNA and the protein data sets we used WEBGESTALT with the same parameters as described under "Pathway Enrichment Analysis in the Protein Data Set." To determine enriched pathways we used all significantly changed probe sets from the mRNA data sets at the stage investigated and not just those also altered in the protein data set.
RESULTS
We investigated the HD mouse model R6/2 for changes in the expression levels of proteins during phenotype onset and progression using a 2-DE gel-based proteomics approach. We observed two stages at which an extraordinarily large number of protein alterations had occurred. These peak alterations were present prior to onset and after the development of pronounced HD-related phenotypes. In addition, most of the changes were found to be stage-specific.
Altered Protein Expression during HD Progression in R6/2 Mice-To cover onset and all stages of disease progression in the well characterized R6/2 HD mouse model, we selected ages starting where no phenotype is present (2 weeks) and finishing where mice demonstrate pronounced symptoms (12 weeks). In our colony, disease end point is defined by 20% loss of body weight that occurs at 14–15 weeks of age. Further time points were selected to correspond to stages where hallmarks of disease progression appear. Loss of brain weight occurs from 4 weeks of age (44), an impairment of motor function as measured by RotaRod analysis is present from ~6 weeks (45), and a visible phenotype is present from ~8 weeks (45) (Fig. 1). After comparing R6/2 with control mice for 2, 4, 6, 8, and 12 weeks separately, the number of differentially expressed proteins for each time point investigated was determined (Table I). For the 2-, 4-, 6-, 8-, and 12-week time points, a total of 3821, 4006, 4030, 4816, and 4283 protein isoforms were analyzed, respectively, to determine significant protein expression differences. As expected, very few protein isoforms were altered in R6/2 mice at 4 (42 isoforms, 1.0% of total isoforms investigated at this age) and 6 weeks (40 isoforms, 1.0%) of age because the phenotype at these stages is very mild. The number of altered protein isoforms increased more than 3-fold at 8 weeks (157 isoforms, 3.3%) and a further 1.5-fold at 12 weeks (240 isoforms, 5.6%).
Unexpectedly we found a large number of protein isoforms altered at 2 weeks of age (205 isoforms, 5.4%). This number was almost as high as those detected at 12 weeks (Table I). We suspected that the number of protein isoforms changed at 2 weeks might not necessarily reflect a drastic change in the total amount of altered protein. Therefore, we determined the relative amount of protein changed (Fig. 2) and found that 3.1% of the total spot volume (protein concentration) was altered at 2 weeks comparable to 3.2% at 12 weeks of age. Again the values were almost identical. The y axis of Fig. 2 indicates the amount of protein changed relative to a total protein concentration of 100% for a 2-D gel.
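The isoform percentages and fold changes quoted above can be reproduced from the counts given in the text with a short calculation; the variable names are illustrative.

```python
# Counts taken from the text (weeks -> number of protein isoforms).
ANALYZED = {2: 3821, 4: 4006, 6: 4030, 8: 4816, 12: 4283}
ALTERED = {2: 205, 4: 42, 6: 40, 8: 157, 12: 240}

# Percentage of analyzed isoforms that were altered at each stage.
percent_altered = {wk: 100.0 * ALTERED[wk] / ANALYZED[wk] for wk in ANALYZED}

# Fold increase in the number of altered isoforms between later stages.
fold_6_to_8 = ALTERED[8] / ALTERED[6]
fold_8_to_12 = ALTERED[12] / ALTERED[8]

for wk, pct in percent_altered.items():
    print(f"{wk:>2} weeks: {ALTERED[wk]} of {ANALYZED[wk]} isoforms = {pct:.1f}%")
print(f"6 -> 8 weeks: {fold_6_to_8:.1f}-fold; 8 -> 12 weeks: {fold_8_to_12:.1f}-fold")
```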
We subsequently identified the altered protein isoforms by mass spectrometry. Table II shows the number of identified proteins for each stage separately. The identification rates were in general very high, ranging from 83 to 94%, except at 6 weeks (53%). However, at 6 weeks only a very small number of proteins were altered (40), and therefore the failure to identify specific proteins makes a large difference to the percentage identified. Because of the generally high identification ratio, the proteins changed represent the entire data set of altered protein spots (Table I). A low identification rate may leave an important subset of proteins beyond scrutiny and may bias the study toward easily identifiable proteins. It is well established that a protein may be represented by more than one protein isoform (spot) on a 2-D gel. Therefore we determined the number of non-redundant proteins that were changed for each stage. We used the gene name as a selection criterion: proteins sharing the same gene name were considered as one non-redundant protein. Therefore, proteins with a different protein name but the same gene name, and proteins altered in more than one isoform (seen on the 2-D gel as a protein spot (18)), were considered to be the same protein and counted only once per stage. Table IIIA shows the number of non-redundant proteins changed at each disease stage. As might have been expected, at each stage some proteins were represented by more than one protein spot on the 2-D gel. The number of non-redundant proteins was 73, 89, 86, 77, and 70% of the total number of proteins identified at 2, 4, 6, 8, and 12 weeks, respectively. In total 371 individual, non-redundant proteins were identified. Interestingly, almost all proteins represented by more than one spot showed the same regulatory pattern in all isoforms; that is, they were either up- or down-regulated (Table IIIB).
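Collapsing identified spots to non-redundant proteins by gene name can be sketched as below. The data structure, the function name, and the example spots are hypothetical; the sketch only illustrates the counting rule described above (one protein per gene name per stage) and the check for a consistent direction of regulation across isoforms.

```python
def collapse_by_gene(spots):
    """Group identified spots of one stage by gene name.

    spots: iterable of (gene_name, direction) with direction 'up' or 'down'.
    Returns (directions_by_gene, consistent_genes), where consistent_genes
    are genes whose isoforms all change in the same direction.
    """
    directions_by_gene = {}
    for gene, direction in spots:
        directions_by_gene.setdefault(gene, set()).add(direction)
    consistent = {g for g, d in directions_by_gene.items() if len(d) == 1}
    return directions_by_gene, consistent

# Hypothetical spots identified at one stage.
stage_spots = [
    ("Gapdh", "up"), ("Gapdh", "up"),      # two isoforms, same direction
    ("Uchl1", "down"),
    ("Stmn1", "up"), ("Stmn1", "down"),    # isoforms with opposite direction
]
by_gene, consistent = collapse_by_gene(stage_spots)
ratio = 100.0 * len(by_gene) / len(stage_spots)
print(f"{len(by_gene)} non-redundant proteins from {len(stage_spots)} spots ({ratio:.0f}%)")
print("same direction in all isoforms:", sorted(consistent))
```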
Drastic Protein Expression Changes Early in Disease Precede HD-related Phenotypes-The most unexpected observation was that an early peak in protein alterations in terms of numbers (Tables I-III) and amount (Fig. 2) was observed prior to phenotype onset, and we sought to identify the mechanism underlying this early peak. It is already known that protein changes related to development that are present at 2 weeks of age in the mouse are drastically reduced in adulthood (38). We investigated the longitudinal changes in R6/2 and wild-type mice to elucidate the magnitude of protein changes during development in the presence and absence of disease (Table IV). This means we compared each stage within a group (HD or control) with its adjacent stages. The protein isoform changes between 2 and 4 weeks, 318 in HD and 298 in wild type, were more numerous than all subsequent changes during longitudinal development and its interaction with HD disease progression. Interestingly, the number of changes in wild-type and R6/2 mice reached another peak between 6 and 8 weeks (Table IV). These changes were reproduced when alterations in protein amount (concentration) were considered: again alterations were most dramatic between 2 and 4 weeks in HD and wild-type mice (16.3 and 13.3%, respectively) (Table V). All mice with nominal age 2 weeks were sacrificed exactly 14 days after birth, but to rule out a difference in "developmental age" between R6/2 and control mice at 2 weeks of age we compared the body weight of both groups. R6/2 mice had an average weight of 7.3 ± 0.75 g, and controls had an average weight of 7.54 ± 1.15 g. Brain weights were normally distributed and showed no statistically significant difference (p = 0.814, paired Student's t test).
FIG. 2. Relative protein concentration changes during HD progression. Total brain extracts of R6/2 mice were studied at 2, 4, 6, 8, and 12 weeks of age. Protein concentration alterations were calculated for all significantly altered protein isoforms (p < 0.05, <0.9- or >1.1-fold change). The total spot volume (spot intensity × spot area) on each 2-D gel was set to 100%. The y axis indicates the amount of protein changed relative to a total protein concentration of 100%. Asterisks indicate statistical significance (p < 0.05). Black bars indicate up-regulated proteins, and gray bars indicate down-regulated proteins. The numbers above each bar indicate percentage of change. The line above the bars indicates the sum of up- and down-regulated protein concentration changes that is also indicated by numbers.
Therefore, the number and extent of protein expression changes occurring between 2 and 4 weeks were large regardless of the presence of disease. We now used C57BL/6 mice to study expression changes during early development before 2 weeks of age (Table VI). When comparing the results for wild-type CBA × C57BL/6 and inbred C57BL/6 mice, we … (Tables IV, V, and VI). When looking at earlier changes it becomes clear that the amount of change is considerably higher between time points in earlier mouse developmental stages (35). We analyzed protein changes at embryonic days 16 and 18 and postnatal day (P) 0, P7, P14, and P28 (n = 3 per time point). We then calculated the number of isoforms and protein amount changed between consecutive time points and converted this to the average number of isoforms changed per day for comparison purposes (Table VI). A peak in the change in protein amount and isoform number was observed between P7 and P14 (1 and 2 weeks after birth; Table VI and Fig. 3). These data suggest that during normal development the changes in protein expression that occur at early stages (before 2 weeks) are already very pronounced. Generally the changes per day are very high except shortly before birth (Table VIB). A perturbation such as the expression of the transgenic fragment of the HD gene in R6/2 mice (32) may easily disturb a delicate equilibrium of expression changes during normal development. This may explain the large number of protein expression changes observed early in disease at 2 weeks (Tables I-III and Fig. 2).

Overlap of Protein Expression Changes between Different Stages of Disease Progression-After studying the possible causes of the protein expression overlap, the identity and properties of the proteins generating this early peak in differential expression were investigated. First the degree of similarity between the proteins that change in disease was investigated. We compared the gene names corresponding to the non-redundant proteins identified for each stage and found a protein expression overlap of 6–38% depending on the stages compared (Table VIIA). The protein expression overlap is defined as the percentage of gene names shared between two stages compared. Interestingly, the overlap between 2 and 12 weeks of age was very high (38%). We then considered whether the direction of change in expression (i.e. up or down) was concordant or discordant between stages (Table VIIB). It became clear that even if the same proteins were differentially regulated at both 2 and 12 weeks, the direction of expression was not necessarily the same. Correction for the direction of expression, that is, removal of genes that showed opposite regulation (i.e. up versus down) in the stages compared, reduced the overlap to 27%. This means that 73% of the protein changes were specific to either 2 or 12 weeks of age. The overlap in differentially expressed proteins was corrected to account for the direction of change for all comparisons and found to range from 0 to 29%, with the lowest being between 6 and 8 weeks of age (Table VIIA). Although an expression overlap of up to 29% is still large, overall between 71 and 100% of the proteins that were differentially expressed were stage-specific (Table VII).
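The overlap calculation, with and without correction for the direction of change, can be sketched as follows. The text defines the overlap as the percentage of gene names shared between two stages but does not state the denominator; this sketch assumes the union of the gene names of both stages, which is an assumption and may differ from the calculation behind Table VII. The function name and the example stages are hypothetical.

```python
def overlap_percent(stage_a, stage_b, match_direction=False):
    """Percentage of gene names shared between two stages.

    stage_a, stage_b: dict mapping gene name -> 'up' or 'down'.
    match_direction: if True, shared genes with opposite regulation in the
    two stages are removed before the percentage is calculated.
    Denominator (assumption): number of gene names in the union of both stages.
    """
    shared = set(stage_a) & set(stage_b)
    if match_direction:
        shared = {g for g in shared if stage_a[g] == stage_b[g]}
    union = set(stage_a) | set(stage_b)
    return 100.0 * len(shared) / len(union) if union else 0.0

# Hypothetical non-redundant proteins altered at two stages.
early = {"GeneA": "up", "GeneB": "down", "GeneC": "up"}
late = {"GeneB": "down", "GeneC": "down", "GeneD": "up"}

raw = overlap_percent(early, late)
corrected = overlap_percent(early, late, match_direction=True)
print(f"overlap {raw:.0f}%, corrected for direction {corrected:.0f}%, "
      f"stage-specific {100 - corrected:.0f}%")
```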
Early Changes in Energy Metabolism during HD Pathology-To obtain a deeper understanding of the processes involved in disease progression it is important to determine whether specific pathways are enriched within the altered protein data set. For this purpose we carried out a KEGG analysis (46). We analyzed all time points separately and only included pathways where at least three proteins were enriched. This cutoff was chosen to ensure that a reasonable number of proteins were altered for a given pathway, thereby providing strong evidence for altered regulation. Pathways enriched in up-regulated and down-regulated proteins were identified (Table VIII). Interestingly, glycolysis/gluconeogenesis was found to be at the top of the up-regulated categories at all stages except at 12 weeks of age, when most up-regulated proteins were found in pathways involving proteasome function. In contrast, some proteins involved in glycolysis/gluconeogenesis were down-regulated at 12 weeks. Of other down-regulated pathways, the "regulation of actin cytoskeleton" was at the top of the list at 2 and 12 weeks of age, and proteasome function was down-regulated at 2 weeks of age. In summary, "glycolysis/gluconeogenesis" was mostly up-regulated, whereas "regulation of cytoskeleton" was down-regulated. The regulation of proteasome function seems to be stage-specific.
FIG. 3. Longitudinal changes of protein expression during development. Changes in protein expression per day in terms of number of proteins (A) and protein concentration (B) were determined. Total brain extracts of embryonic day 16 and 18 and neonate P0, P7, P14, and P28 wild-type C57BL/6 mice were compared. Altered protein numbers and concentration were calculated for all significantly altered protein isoforms (p < 0.05, <0.9- or >1.1-fold change). For a complete list of all alterations see Table VI.
When the overlap between proteins of altered expression was considered, those overlapping between 2 and 12 weeks of age (Table VII) belong to glycolysis/gluconeogenesis, the "pentose phosphate pathway," and "proteasome". In contrast, when the protein expression overlap from 8 and 12 weeks was compared, glycolysis/gluconeogenesis was still the top ranking enriched pathway followed by "Parkinson disease" (Ube1x and Uchl1), and "metabolism of xenobiotics by cytochrome P450" as well as "MAPK signaling pathway" were also included. At the level of individual proteins, only nine gene names, Cops4, Efhd2, Mtpn, Phpt1, Sept7, Slc25a12, Stmn1, Tppp, and Uchl1, were identified in the expression overlap data sets of 2 versus 12 weeks and 8 versus 12 weeks. Therefore, because a total of 49 genes overlapped between 2 and 12 weeks and 35 overlapped between 8 and 12 weeks, there is an overlap of 23% between both data sets at the individual protein level.
We now investigated the possibility that, because some proteins within an altered pathway may be up-regulated and others down-regulated, such pathways may be lost in an analysis that separates up- and down-regulated proteins. Table X shows that all pathways found with separate analysis (Table VIII) were also found when up- and down-regulated proteins were subjected to KEGG analysis simultaneously. Because the number of proteins for some of the pathways involved is higher in Table X, some proteins of the same pathway must show opposite regulation. Still most proteins within a pathway were either up- or down-regulated. More altered pathways per stage were found with simultaneous analysis, and importantly in addition to glycolysis/gluconeogenesis, "oxidative phosphorylation" was found to be altered at all stages except 6 weeks (Table X). This common enrichment in altered pathways is clearly contrasted by the fact that only two proteins, represented by the gene names glyceraldehyde-3-phosphate dehydrogenase (Gapdh) and pleckstrin homology (PH) and SEC7 domain-containing protein 3 (Psd3), were altered at four of the stages and that none were altered in all five stages. Gapdh catalyzes the conversion of D-glyceraldehyde 3-phosphate, phosphate, and NAD+ to 3-phospho-D-glyceroyl phosphate and NADH and is involved in glycolysis, but a nuclear function has also been described. Psd3 acts as a guanine nucleotide exchange factor for ARF6 and is located at cell junctions, the presynapse, the postsynaptic cell membrane, and the postsynaptic density.

a The expression overlap corrected by the relative similarity of expression orientation between mRNA and protein datasets (number in parentheses).
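Why a pathway can be missed by separate analysis but found by simultaneous analysis follows directly from the three-protein cutoff, as the following minimal sketch shows. The gene names, the pathway label, and the helper `enriched` are hypothetical and serve only to illustrate the reasoning.

```python
from collections import Counter

MIN_PROTEINS = 3  # inclusion cutoff stated in the text


def enriched(genes, gene_to_pathway, min_proteins=MIN_PROTEINS):
    """Return pathways represented by at least min_proteins altered genes."""
    counts = Counter(gene_to_pathway[g] for g in genes if g in gene_to_pathway)
    return {p for p, n in counts.items() if n >= min_proteins}


# Hypothetical genes and pathway label, for illustration only.
gene_to_pathway = {
    "gene1": "pathway X", "gene2": "pathway X",
    "gene3": "pathway X", "gene4": "pathway X",
}
up = ["gene1", "gene2"]      # two members up-regulated
down = ["gene3", "gene4"]    # two members down-regulated

separate = enriched(up, gene_to_pathway) | enriched(down, gene_to_pathway)
combined = enriched(up + down, gene_to_pathway)
print(separate, combined)
```

Neither the up- nor the down-regulated list alone reaches the cutoff, whereas the pooled list does, so the pathway appears only in the simultaneous analysis.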
In summary, when analyzing our proteomics data set, we found two peaks in protein alteration, one early (5.4% of all protein isoforms changed, 2 weeks) and one late in disease (5.6%, 12 weeks). In addition, most changes at each time point investigated were stage-specific (71-100%; see Table VII).
Correlation of mRNA Expression Kinetics during HD Progression in R6/2 Mice-Recently mRNA expression data for R6/2 mice at 6, 9, and 12 weeks of age have been published (22,47), and we used these data sets to compare mRNA and protein expression data to determine the degree by which altered mRNA regulates protein expression. The data sets were generated from the striatum of 6-, 9-, and 12-week-old R6/2 mice. Of the 371 non-redundant proteins that we identified, only 39 were not represented in the mRNA data. Therefore we were able to correlate the expression profiles of 328 mRNAs (88% of the proteins) with their protein expression. Only two of the stages studied at the mRNA level (6 and 12 weeks) coincided directly with the stages studied on the proteome level (2, 4, 6, 8, and 12 weeks). We compared the mRNA expression data with adjacent protein expression data on the single mRNA/protein level. 6-week mRNA expression data were compared with 4-, 6-, and 8-week protein data, the 9-week mRNA data were compared with 8- and 12-week protein data, and the 12-week mRNA data were compared with 8- and 12-week protein data.
The comparison of altered mRNAs and proteins at 6 weeks revealed an overlap of 3 of 13 (23%), and at 12 weeks the overlap was 42% when the direction of alteration was not taken into account. At 6 weeks all three mRNAs that showed a statistically significant alteration were regulated in the opposite direction to the alteration in protein expression (up/down). At 12 weeks this was the case for 16 of the 58 altered mRNAs. Interestingly this opposite regulation was lower, 3 of 31 (10%), when the 9-week mRNA data set was considered (Table IX). Because the pathways that are differentially regulated at 2 and 12 weeks of age are quite similar, mRNA alterations at 12 weeks were compared with the proteins that were altered at 2 weeks. 42% of the mRNAs altered at 12 weeks were also altered at the protein level at 2 weeks. If those changing in the opposite direction are excluded the overlap is still 23% (Table IX). Therefore although there is co-regulation between protein and mRNA expression it is generally low on the level of individual proteins/mRNA.
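The fractions quoted in this paragraph can be checked with a few lines of arithmetic. The helper `percent` and the variable names are introduced here for illustration; the counts are the ones given in the text.

```python
def percent(part, total):
    """Percentage of `part` in `total`, rounded to the nearest integer."""
    return round(100 * part / total)


overlap_6w = percent(3, 13)     # overlap at 6 weeks, quoted as 23%
opposite_9w = percent(3, 31)    # opposite regulation at 9 weeks, quoted as 10%
opposite_12w = percent(16, 58)  # opposite regulation at 12 weeks (not quoted as a percentage)
print(overlap_6w, opposite_9w, opposite_12w)
```

On these counts, opposite regulation affects roughly 28% of the altered mRNAs at 12 weeks compared with 10% at 9 weeks, which is the difference the text describes as "lower".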
We now investigated the overlap between pathways of the mRNA and protein data sets. We determined the number of differentially expressed proteins considering mRNAs with p < 0.05. Subsequently we carried out a KEGG analysis using 2775, 4338, and 2940 significantly altered, non-redundant genes for 6, 9, and 12 weeks, respectively. We compared up- and down-regulated mRNAs and proteins for each stage together (Tables X and XI). Stage "8 weeks" of the protein data set was compared with "9 weeks" of the mRNA data set, and stage "12 weeks" of the protein data set was compared with 12 weeks of the mRNA data set. We found an overlap of 6 of 8 pathways at 8/9 weeks and 6 of 10 pathways at 12 weeks. No pathways were altered in the protein data set at 12 weeks. The top five pathways in Table XI are those listed at the top of a KEGG pathway analysis when using the mRNA datasets for each of the three age stages investigated. The additional pathways listed are altered in at least one of the stages from the protein data set. Interestingly the top scoring pathways of the mRNA data set such as oxidative phosphorylation and regulation of actin cytoskeleton were also altered in the protein data set (Tables X and XI).

DISCUSSION

In this study we investigated early changes in protein expression and followed these during disease progression in the R6/2 HD mouse model. We observed two peaks of altered protein expression, one at 2 weeks of age prior to the onset of phenotypes and one at 12 weeks when symptoms are pronounced. These changes corresponded to about 6% of the entire proteome (protein isoforms) studied. Although most alterations were stage-specific, in some cases e.g. proteins involved in glycolysis/gluconeogenesis were dysregulated at every stage. In addition there was a pronounced similarity between early and late changes at the protein and mRNA level.
When comparing the mRNA and protein changes they showed a small overlap (<30%); that is over 70% of the changes were specific to mRNAs or proteins.

(16) a Total mRNAs were determined by using the number of nonredundant proteins (see also Table IIIA (Total)) minus the number that are not represented in the mRNA data set (parentheses).
b Number of mRNAs regulated in the opposite direction (up/down) to the proteins is indicated.
c ND, not done.
The identification of early disease-related alterations will play a key role in our understanding of HD. However, most studies focus on late stages of disease in mouse models once an overt phenotype has occurred or study patients with manifest disease or postmortem brains of people who have died at late stage disease. In this study, we investigated the disease kinetics of altered protein expression levels, starting prior to phenotype onset and continuing through known stages of the disease.
Peak in Early Changes at 2 Weeks of Age-The most startling result of this study was the extent to which protein alterations had already occurred at 2 weeks of age. The number of protein alterations was almost equivalent to those found at a late stage of disease and showed a significant overlap in terms of gene names. The pathway identified using the KEGG database that is most altered during disease progression is glycolysis/gluconeogenesis. Interestingly at 2 weeks, almost all glycolytic enzymes (phosphofructokinase (Pfkm), aldolase (Aldoa and Aldoc), Gapdh, phosphoglycerate kinase 1 (Pgk1), and pyruvate kinase (Pkm2)) were up-regulated in their expression level (supplemental Table 1). In addition, an up-regulation of energy metabolism was further supported by up-regulation of key enzymes of the citrate cycle: aconitase 2 (Aco2), fumarate hydratase 1 (Fh1), and malate dehydrogenase 2 (Mdh2). This is in accordance with a weight loss that occurs early in HD despite a high caloric intake (48). Weight loss has clearly been established in pre- and early symptomatic patients, and a significant reduction in the concentration of branched chain amino acids could be detected in their plasma by nuclear magnetic resonance spectroscopy suggesting that a perturbed mitochondrial energy metabolism is relevant to early pathogenesis (49). These changes are also consistent with an early disturbance of the transcription factor PGC-1α.

The first five pathways selected were the top pathways in the KEGG analysis of the mRNA data set. The following pathways are those that overlap with any pathway in the protein data sets. The cutoff for inclusion into the table for the mRNA data sets was 10 mRNAs. Pathways marked bold overlap with protein expression data at 8 weeks (mRNA at 9 weeks) and 12 weeks. Only pathways with three or more proteins and p < 0.01 were included. Numbers in parentheses were obtained from a second dataset for 12-week-old R6/2 mice. (16)
The suppression of PGC-1α leads to mitochondrial dysfunction and therefore a dysfunction of the energy metabolism and neurodegeneration (50,51). At 2 weeks of age, an increase in "glutamate metabolism" was observed (Table VIIIA). Glutamate-ammonia ligase (Glul), glutamate dehydrogenase 1 (Glud1), glutamate oxaloacetate transaminase 2 (Got2), and 4-aminobutyrate aminotransferase (Abat) were all up-regulated in expression. Glul converts glutamate to glutamine in astrocytes after glutamate-mediated neuronal signaling. After signaling, glutamate is removed from the synaptic cleft by astrocytes (52,53), and glutamate conversion to glutamine is directly coupled to glutamate signaling (52,53). Consequently this increased Glul expression may indicate an increase in glutamate signaling at a very early stage in HD progression at least in the R6/2 mouse model studied. Increased glutamate signaling at the N-methyl-D-aspartate subclass of ionotropic glutamate receptors has been linked to excitotoxicity, especially in striatal neurons, which has been proposed as a pathogenic disease mechanism for HD (54).
Furthermore at 2 weeks of age proteins involved in exo- and endocytosis were altered in expression. Cplx1, Cplx2 (both down-regulated), Syn1, and Syn2 (both up-regulated) are involved in exocytosis. Pacsin1 (up-regulated; three isoforms) and Dnm1 (down-regulated) are involved in endocytosis (55). In addition, Arpc1a and Arpc5, members of the Arp2/3 complex linking Pacsin1 and N-Wasp to actin to ensure functional endocytosis (56), were up- and down-regulated, respectively. Homer1 (two isoforms) located in the postsynaptic density (57) was up-regulated in expression. This dysregulation of proteins involved in exo- and endocytosis suggests that synapses are perturbed early in the R6/2 HD model. These changes were not detected by KEGG pathway analysis because up- and down-regulated proteins were investigated separately.
So far up- and down-regulated pathways were studied separately. Because proteins may be up- or down-regulated in the same pathways in disease we also considered pathway changes when up- and down-regulated proteins were subjected to KEGG analysis simultaneously. All pathways found to be altered when up- and down-regulated proteins were studied separately were also found in the simultaneous analysis. In addition we found that not only glycolysis/gluconeogenesis but also "oxidative phosphorylation" was altered at all stages except 6 weeks (Table X). In summary the changes observed may be due to a disturbance of the delicate equilibrium during development because of the large number of changes naturally occurring at this time (Tables IV and V), or it is conceivable that the early changes act as a compensatory mechanism for early transgene Htt expression already in the absence of symptoms.
When considering the base-line level of false positives of 39 proteins per stage (see "Experimental Procedures" for more information) it becomes obvious that the protein alterations found at 4 and 6 weeks that are just above this level (Table I) should be treated with caution, whereas the other three time points that are more than 4-fold above this threshold are far more reliable. When considering this and the number of pathways altered at 2, 8, and 12 weeks in addition to confirmatory literature data the results of this study provide a valuable contribution to unraveling the dynamic protein changes in HD.
Changes Are Stage-specific on the Individual Protein Level but Overlapping in Terms of Metabolic Pathways-Unexpectedly we found that most of the changes identified at each stage were stage-specific. The degree of specificity ranged from 71 to 100% (Table VII). One important conclusion of this finding is that a disease is a complex process that is not represented by linear changes in protein expression starting early at low levels, e.g. 2 weeks, increasing steadily to high levels close to terminal disease (12 weeks; Fig. 1). However, the groups of metabolic pathways altered are relatively constant, especially glycolysis/gluconeogenesis and those involved in proteasome function. It is interesting that the number of changes varies considerably between stages ranging from 18 non-redundant proteins at 6 weeks to 158 at 12 weeks (Table III).
Proteasomal Alterations Dominate Late Changes-Alterations in the expression levels of proteins involved in proteasome function dominated at 12 weeks of age. Although proteasomal changes have already occurred at 2 weeks these were down-regulated. At 8 weeks they were up-regulated but less numerous, and at 12 weeks the following proteins represented by their gene names were up-regulated: PSMA3, PSMA5, PSMB1, PSMB3, PSMB4, PSMB5, PSMC3, PSMC5, PSMD4, and PSMD7. PSMA3 and PSMA5 are members of the 20 S proteasome α-subunits, whereas PSMB1, PSMB3, PSMB4, and PSMB5 are 20 S proteasome β-subunits (58). The β-subunits are catalytically active and responsible for proteasome specificity (58). So far 11 β-subunits have been identified in mice of which four were up-regulated in our study. In addition, all of the dysregulated subunits are essential subunits in yeast that are functionally conserved in humans (58). PSMC3, PSMC5, PSMD4, and PSMD7 belong to the regulatory or 19 S proteasome, which together with the 20 S complex forms the 26 S proteasome in eukaryotes (58-60). PSMA3, PSMB4, and PSMD4 were already up-regulated at 8 weeks of age. In contrast, PSMA5, PSMA6, and PSMB6 were down-regulated at 2 weeks. Only PSMA5 was later up-regulated at 12 weeks. Therefore, proteasomal changes are specific to certain stages of disease. Global changes to the ubiquitin proteasome system have already been reported in HD at intermediate and late stages of disease at the functional level (61). In this context it is interesting that UBE1X (UBA1) was up-regulated in R6/2 mice at 8 and 12 weeks of age. UBE1X is currently thought to be the sole E1 for charging E2s with ubiquitin in mammals (62,63). E1, E2, and E3 are enzymes that conjugate ubiquitin to cellular proteins in a multistep pathway prior to their degradation (64).
In addition UCHL1, which has already been shown to be involved in the pathology of Parkinson disease and exhibits a ligase as well as a ubiquitin hydrolase activity, was down-regulated at 2 weeks of age but up-regulated at 8 and 12 weeks (supplemental Table 1). In summary components of the ubiquitin/proteasome pathway are up-regulated in symptomatic disease. This may reflect efforts to remove protein aggregates that are a hallmark of HD and other neurodegenerative diseases (65-67).
Relatively High Degree of Overlap between Early and Late Changes-When comparing changes at 2 and 12 weeks of age many of the protein changes can be observed at both stages (Table VIIB). It is interesting that this overlap exists at the level of pathways as well as at the level of individual proteins (Tables VIII and IX). However, the regulatory pattern of alterations is different at all stages (Tables VIIIB and IX). This may be caused by different types of perturbation in early and late stage disease that is reflected by the differential changes. This is supported by the changes observed in glycolysis/gluconeogenesis that were strictly up-regulated at 2 weeks of age but that were divided (five versus five proteins) at 12 weeks. An early effort to compensate for increased energy requirement early in disease (2 weeks) may be followed by a general deterioration of energy metabolism later prior to death.
No Protein Was Altered in All and Only Two Proteins Were Changed at Four Stages-After comparing the proteins changed at each stage it became obvious that only two proteins, Psd3 and Gapdh, were changed at four of the five stages investigated, and none were altered at all five stages. Interestingly Gapdh has not only been reported to be involved in the pathology of HD (68, 69) but also in Alzheimer and Parkinson diseases (68). The number of proteins changed is therefore of more interest than the protein identities. The number of proteins investigated by a 2-DE proteomics approach is limited to up to 10,000 protein isoform spots depending on the application (18). The total number of proteins of the mouse proteome affected by HD is considerably larger than the number investigated in our study, but the ratio of changed proteins compared to unchanged proteins may remain approximately the same; that is the relative number of protein changes for each time point remains constant between our study and all proteins in the proteome. Therefore this study argues against a model in which there is a gradual increase in the number and magnitude of protein changes during disease progression and suggests a more dynamic regulatory pattern. We suggest that early changes affect late stage disease by changing processes in the mouse brain irreversibly. Our group has already shown that acute and long term proteome changes can be induced by oxidative stress in the developing brain (70). Oxidative stress has been proposed as a pathogenic mechanism for neurodegenerative disease in general. Early energy deficits or structural changes (Table VIII) may act in the same way.
High Correlation between Altered mRNA and Protein Expression on the Pathway Level-In the past, a large number of mRNA expression studies were done for HD mouse models (14,21,22,43,47,71) as well as human tissue (26,47) and blood samples (27). All studies so far used intermediate or late stage disease samples starting for the R6/2 mouse model used in our study at 6 weeks of age (22,24). This means that the two earliest time points, 2 and 4 weeks, used in our study could not be correlated to mRNA expression, and it was not possible to investigate whether the early rise in differential protein expression was mirrored by a rise in altered mRNA expression. Still we found that the correlation between mRNA and protein expression changes was very low in our study on the level of individual proteins and did not exceed 30%, although 88% of the differentially expressed proteins were represented by mRNAs. This is consistent with previous studies in which mRNA data did not necessarily correlate well with protein data (72-76), although these studies did not investigate changes due to disease. In a recent study using brain cortex tissue, we found a 60-70% overlap between mRNA and protein changes during embryonic development where processes are highly predetermined and may be regulated to a large degree by mRNA expression with subsequent protein changes (35). Changes from three time points in embryonic development, embryonic days (ED) 9.5, 11.5, and 13.5, were compared with each other. Unfortunately the study investigates only changes up to ED 13.5, ending about 7 days before birth at ED 21 after gestation. Still during disease progression in contrast to development a cell or tissue may have to react to complex, unforeseen, and highly dynamic protein changes that are not predetermined by altered mRNA expression as they are in embryonic development.
mRNA changes may be largely the result of pathological processes at the protein level such as protein aggregation or aberrant interactions of Htt that perturb transcription (77,78).
After studying the overlap on the level of individual proteins we tested the pathway overlap between protein and mRNA data. We compared altered pathways at 8 and 12 weeks of the protein data set with 9 and 12 weeks of the mRNA data set (Tables X and XI). We found that the overlap of the pathways involved was actually very high. Therefore it is important to consider functional units such as pathways to measure the overlap between mRNA and protein data sets and not individual proteins or mRNAs.
In addition, a recent mRNA expression study of total brain extracts from R6/2 mice at 12 weeks of age revealed an alteration of 42 transcripts (see Fig. 1A of Ref. 79) that were all down-regulated in expression (79). When comparing the gene names from this study with our protein expression results we found no overlap on the single gene/protein or pathway level. Using the KEGG pathway analysis tool we found that the five most prominently altered pathways in the mRNA data set (79) were all related to neuronal signaling ("neuroactive ligand-receptor interaction" (six mRNAs), MAPK signaling pathway (six mRNAs), "cytokine-cytokine receptor interaction" (six mRNAs), "long term potentiation" (five mRNAs), and "GnRH signaling pathway" (five mRNAs)). Interestingly when considering the mRNA data investigated in this study (Table XI) we found that the MAPK signaling pathway was altered in both mRNA studies. Upon a more detailed inspection of the KEGG analysis of mRNA data for 12 weeks (data not shown), neuroactive ligand-receptor interaction (39 mRNAs), cytokine-cytokine receptor interaction (34 mRNAs), long term potentiation (22 mRNAs), and GnRH signaling pathway (25 mRNAs) were also altered on the mRNA level in our study. The most likely explanation for the discrepancy between mRNA and protein data is that most altered signaling mRNAs found in the study of Luthi-Carter et al. (79) encode proteins that are hydrophobic (receptors) and/or low in abundance (ligands) and therefore cannot be detected by 2-DE. Importantly it would be of tremendous value to study mRNA and protein expression in the same tissue and time points in parallel because only then will we be able to determine the true overlap between differential mRNA and protein expression without confounding influences such as different time points investigated (8 versus 9 weeks) or brain regions (total brain versus striatum).
In summary, we used a large 2-D gel/mass spectrometry-based proteomics approach to investigate HD-induced protein expression alterations and their kinetics prior to the onset of phenotypes and during the course of disease. Unexpectedly we found that protein changes were largely stage-specific (71-100%) and that a drastic alteration in protein expression (almost 6% of the proteome) occurred as early as 2 weeks of age, predominantly including up-regulation of glycolysis/gluconeogenesis and down-regulation of the actin cytoskeleton. This suggests a period of highly variable protein expression that precedes the visible HD phenotype. Although an up-regulation of glycolysis/gluconeogenesis-related protein alterations remained dominant during HD progression, late stage alterations at 12 weeks showed an up-regulation of proteins having proteasomal function. Our observations suggest that HD is characterized by a highly dynamic disease pathology not represented by linear protein concentration alterations over the course of the disease. Detailed time course studies for disease progression with emphasis on early and very early stages are important to understand disease pathology and determine the time for intervention when considering HD therapy.