Avoiding Pitfalls of Internal Controls: Validation of Reference Genes for Analysis by qRT-PCR and Western Blot throughout Rat Retinal Development

Background Housekeeping genes have been commonly used as reference to normalize gene expression and protein content data because of its presumed constitutive expression. In this paper, we challenge the consensual idea that housekeeping genes are reliable controls for expression studies in the retina through the investigation of a panel of reference genes potentially suitable for analysis of different stages of retinal development. Methodology/Principal Findings We applied statistical tools on combinations of retinal developmental stages to assess the most stable internal controls for quantitative RT-PCR (qRT-PCR). The stability of expression of seven putative reference genes (Actb, B2m, Gapdh, Hprt1, Mapk1, Ppia and Rn18s) was analyzed using geNorm, BestKeeper and Normfinder software. In addition, several housekeeping genes were tested as loading controls for Western blot in the same sample panel, using Image J. Overall, for qRT-PCR the combination of Gapdh and Mapk1 showed the highest stability for most experimental sets. Actb was downregulated in more mature stages, while Rn18s and Hprt1 showed the highest variability. We normalized the expression of cyclin D1 using various reference genes and demonstrated that spurious results may result from blind selection of internal controls. For Western blot significant variation could be seen among four putative internal controls (β-actin, cyclophilin b, α-tubulin and lamin A/C), while MAPK1 was stably expressed. Conclusion Putative housekeeping genes exhibit significant variation in both mRNA and protein content during retinal development. Our results showed that distinct combinations of internal controls fit for each experimental set in the case of qRT-PCR and that MAPK1 is a reliable loading control for Western blot. The results indicate that biased study outcomes may follow the use of reference genes without prior validation for qRT-PCR and Western blot.


Introduction
Gene expression analyses are crucial for the discovery and characterization of the roles for known genes [1]. Concerning the study of the development of different tissues, these analyses can provide insights into complex regulatory networks that coordinate proliferation, cell commitment, differentiation and apoptosis. Considering the complexity of the neural tissue a major issue is the standardization of quantitative approaches to investigate expression patterns in response to specific treatments or throughout development. In this study, we addressed this question focusing on retinal development.
The retina is derived from the diencephalon, and is responsible for the conversion of electromagnetic energy into nerve impulses [2]. Vertebrate retinas are composed of seven major cell types that are produced from multipotent progenitor cells [3,4]. During development, these progenitors expand through cell proliferation, commit to distinct cell types and exit the cell cycle to generate either retinal neurons or the Müller glia in an evolutionary conserved birth order [5]. Two types of photoreceptors are responsible for phototransduction and while cones are involved in photopic and color vision, rods are responsible for scotopic vision [2]. Photoreceptors signal to bipolar cells and these to ganglion cells, while information is laterally processed through horizontal and amacrine cells [2]. Retinal ganglion cells carry the visual input to the brain through the optic nerve [2] .
Quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR) and Western blot are widely used to quantify RNA and protein content, respectively. qRT-PCR is highly sensitive, allowing the quantification of rare transcripts. It has high specificity, good reproducibility, and a wide dynamic range [6,7]. Western blot is a semi-quantitative method used to identify individual proteins in complex protein extracts. It has high specificity due to antigen-antibody interaction, which can be verified by checking the expected molecular weight. Signal amplification is obtained through the use of primary and secondary antibodies [8]. Experimental errors can, however, be introduced at multiple steps in both protocols due to variability in the amount of starting material, extraction and pipetting [9]. In the case of RNA, additional variation is found owing to the efficiency of reverse transcription or the amount of input template, whereas transfer efficiency is a recurrent problem in Western blot. So, it is essential to account for experimental variance as well as biological differences when conducting gene expression studies.
The ratio between target and internal control is used to standardize independent biological samples [7]. The stability of expression of an internal or loading control is required for accurate and reliable normalization in both qRT-PCR experiments and Western blot [7,10]. The reference genes are typically selected because of their role in key biological pathways, ubiquitous expression and similar expression levels among all samples. However, several so-called housekeeping genes commonly used as reference can be dynamically expressed either in response to treatments or throughout development [11,12].
It is likely that many constitutive genes are not stable during retinal development. For example, expression of both b-actin (Actb) and glyceraldehyde-3-phosphate-dehydrogenase (Gapdh) were shown to vary with tissue maturation [13,14,15,16,17,18,19]. Indeed, a proteomics study of the developing mouse retina identified a large set of proteins whose expression is altered throughout development, including b-actin and tubulin a-1 chain [12]. Therefore, selection of up-or downregulated genes as reference during retinal maturation may affect statistical parameters such as power and sample size [20,21]. Proper verification of suitable endogenous controls would prevent inadequate quantification and spurious findings. This is particularly relevant in the case of genes that suffer subtle changes throughout development or under experimental conditions. Many studies have been carried out on animal and plant samples to describe reference genes for normalization [1,10,11,22,23,24]. Several tools have been developed to identify the most stably expressed genes in a specific setup, but none is universally accepted. Here, we report the validation of suitable internal controls for expression analysis by qRT-PCR during development of the rat retina, with the use of geNorm [1], BestKeeper [9] and NormFinder [25] software. The seven genes tested in this study were chosen based on its previous use as reference genes [26,27,28,29,30] and also because of its involvement in diverse cellular processes, which reduces the probability of co-regulation. This is the first in-depth study to validate internal controls for expression analysis throughout retinal development. The relevance of this evaluation was illustrated by changing results of cyclin D1 RNA measurements when inappropriate reference genes were used for normalization. In addition, we validated loading controls for Western blot analysis across the same sample panel.

Validation of loading control for Western blot
To identify an internal control with stable expression throughout retinal development, we characterized the protein content of two commonly used housekeeping genes, b-actin (ACTB) and atubulin; MAPK1 (ERK2), previously used as loading control by our group [27,31]; Cyclophilin B and Lamin A/C. Lamin A/C and Cyclophilin B were not detected in all stages of development (data not shown). Protein extracts were sampled at least three times from each developmental stage from embryonic (E) to postnatal (P) days: E18, P1, P4, P10, P14 and P45 ( Figure 1A). Furthermore, to analyze the variation of expression among the three candidates detected in all stages, immunoblot was performed in the same membrane for each biological replicate after stripping, so that loading errors would not mask the results.
b-actin showed an evident variation during development with a Coefficient of variation (Cv) of 36.6%. b-actin protein content at E18 was significantly higher than at P45 (109.164. 8 Figure 1B).
In contrast, a-tubulin showed less variation with a Cv of 24.5%, but could be used properly as an internal control only if there is no intent to evaluate the expression at mature stages. This is because the content of a-tubulin at P45 was significantly different than at P4 (P45 59.665.5 versus P4 99.968.3, p,0.05). Although a tendency is also observed when E18 is compared to P4 and P10, the difference was not statistically significant. Similarly, this is observed when P45 is compared to P10. ( Figure 1B). MAPK1 protein content was slightly variable between the ages analyzed and the Cv was approximately two-fold lower than bactin, 18.5% ( Figure 1B). Thus, MAPK1 is, among the proteins tested, the most reliable loading control, although this gene is not commonly recognized as a housekeeping gene [11]. For each sample, qRT-PCR was done for seven candidate reference genes (Table S1) with technical triplicates. For each of the two biological replicates, RNA was extracted from 2-3 pooled individuals, so that the RNA could be considered an average sample of the developmental stage analyzed. The distribution of Ct data among the 12 samples is shown in Figure 2.
For geNorm use, relative values calculated by 2 -DCT method were imported and the analysis provided a measure of gene expression stability (M) for each reference gene from least (highest M value) to most stable (lowest M value) ( Figure 3; Table 1). geNorm also generated other two outcomes, the pairwise variation value V n/n+1 and the effects of step wise inclusion of the next most stable reference gene on V n/n+1 ( Figure 4).
First, we analyzed the expression stability for Group 1 (E18, P1 and P4). Gapdh, B2m and Mapk1 showed M values below the theoretical threshold of 0.5, indicating adequate gene stability ( Figure 3; Table 1). M value increased moderately for Actb, whilst Ppia, Rn18s and Hprt1 showed higher variability with an increased slope of M value curve. Importantly, Rn18s, which is commonly used as an internal control, showed the second largest M value. The optimal number of reference genes for use in standardization can be deduced from pairwise variation. A value from 0.15 to 0.20 is generally considered as an appropriate cutoff for the pairwise variation, although this should be regarded as a reference rather than a strict value [1,32,33]. Below this threshold the addition of an extra reference may not result in significant improvement, and even lead to reduced average expression stability. In Group 1, the pairwise variation V 2/3 yielded a value of 0.138, which was already below 0.15 ( Figure 4). This means that, based on geNorm, a combination of Gapdh and B2m is stable enough to standardize Group 1, without the need to add Mapk1. The stability of the candidates was further analyzed by BestKeeper and Normfinder software. The coefficient of variance calculated by BestKeeper (Table 2) based on Ct values of Group 1 samples led to the same ranking as geNorm (Table 1). On the other hand, Normfinder, which takes into account both intra-and intergroup variations for calculation of the normalization factor, pointed at Mapk1 and B2m as the best combination of reference genes (Table 3).
For Group 2 (P1, P4 and P10), Actb, Mapk1, Rn18s and Gapdh had M values below the theoretical threshold of 0.5 ( Figure 3). M value for B2m is 0.535 so it was not a good candidate for Group 2 normalization. These data reinforce that each experimental setup may have its best combination of reference genes. Pairwise variation V 2/3 achieved a value of 0.116 ( Figure 4). Therefore, based on geNorm, Actb and Mapk1 are stable enough to standardize Group 2. BestKeeper outcome showed that Actb and Mapk1 had the lowest coefficient of variance (Table 2). On the other hand, Normfinder pointed at Gapdh and Mapk1 as the best combination of reference genes (Table 3).
Gapdh, Mapk1 and Rn18s showed M values below the theoretical threshold of 0.5 for Group 3 (P1, P4, P10, P14, P45) ( Figure 3). Pairwise variation V 2/3 had a value of 0.163 ( Figure 4). Thus, geNorm would recommend the use of the two most stable genes Gapdh and Mapk1. In addition, Bestkeeper and Normfinder confirmed the stability of both genes ( Table 2 and Table 3).
Analysis of Group 4 (P10, P14 and P45) showed discrepancies among the softwares. geNorm picked Gapdh and Rn18s as the most stable reference genes ( Figure 3; Table 1). Moreover, Mapk1 and Ppia showed M values below 0.5 and Pairwise variation V 2/3 had a value of 0.156 (Figure 3 and 4). Bestkeeper calculated the lowest coefficient of variance for Mapk1 and Gapdh, while Normfinder chose Gapdh and Ppia as most stable genes ( Table 2 and Table 3). It is important to note that Rn18s and Ppia had the highest coefficients of variance ( Table 2).
The same analysis was performed for all samples together (Group 5-all ages: E18, P1, P4, P10, P14, P45), so that we could compare the expression levels of target genes within a broader range of ages. Gapdh and Mapk1 were ranked by all three software as the most stable genes to normalize Group 5 ( Figure 3; Tables 1, 2, 3), but only V 4/5 geNorm pairwise variation achieved an appropriate value of 0.16 ( Figure 4). Therefore, although an agreement on the two more stable genes were observed for all programs, based on geNorm the use of the four most stable genes would be recommended.
Finally, we assessed the effect of a blind selection of reference genes on the normalization of cyclin D1 expression. Using Group  5, which contains all ages tested, normalization against the most variable candidates (B2m and Hprt1) led to a dramatic increase in standard deviation and discrepancies in cyclin D1 expression pattern ( Figure 5A) when compared to the normalization with the 4 most stable genes, as recommended by geNorm ( Figure 5B) or with Mapk1 and Gadph, which were indicated as the most stable genes by geNorm, Normfinder and Bestkeeper ( Figure 5C). These last two conditions presented very similar results (Figures 5B and  5C). Our results led us to recommend the internal controls indicated in Table 4 (see details in Discussion).

Discussion
The ideal control gene should have similar expression regardless of experimental conditions, such as: developmental stages, composition of cell types, and/or sample treatments. This applies both for Western blot and qRT-PCR. Indeed, this problem has been addressed for gene expression studies using qRT-PCR [1,22,23,24], but few studies have already characterized reliable loading controls for Western blot [10]. Adequate selection of reference genes is critical for sensitive and accurate quantification of mRNA or protein content, especially for those genes whose transcript and protein levels are low.
In the present study, we demonstrated a significant decrease in b-actin protein content along retinal development, although it is a housekeeping gene frequently used as an internal control [26,34,35] (Figure 1A, B). a-tubulin showed a coefficient of variance lower than b-actin, but a significant difference was still found between P45 and P4, and a tendency was identified when E18 is compared to P4 and P10 although the difference was not statistically significant ( Figure 1B). For standardization of a broader range of developmental ages from embryonic to mature retina, MAPK1 was the most stable choice (Figure 1), as empirically observed in our laboratory [27]. In spite of the lack of prior evidence on constitutive expression of MAPK1, there were not significant differences in MAPK1 protein content throughout retinal development. In conclusion, our findings highlight a critical problem in previous investigations that used b-actin for normalization, and may be helpful for further studies on retinal development.
Using geNorm, BestKeeper and Normfinder algorithms, we were able to find the best combination of reference genes for qRT-PCR of various groups of samples. NormFinder is an algorithm for the identification of the optimal pair of reference genes out of a group of candidates. This software uses information about expression stability, as well as the variation between sample  subgroups to examine each gene independently and test combinations of gene pairs to compensate the variability of the system [25]. geNorm selects the best two internal control genes with similar intergroup variation [1], whereas BestKeeper calculates the coefficient of variance of each putative reference gene, which is defined as a percentage of the average Cycle threshold (Ct) level [9]. The minimum number of endogenous control required for normalization in gene expression studies is a major aspect in debate. Hence, it is worth considering a balance between the absolute gain in statistical power and the extra cost and effort when using additional reference genes. In this study, NormFinder was the software considered as the most reliable choice because it takes into account both intra-and inter-group variations for normalization. However, geNorm and BestKeeper were also relevant to elect internal controls for each experimental set. Normfinder pointed Mapk1/B2m and Gapdh/ Mapk1 as the best combination to normalize Group1 and Group 2, respectively (Table 4). In both cases, the two reference genes recommended by Normfinder were top ranked by the other programs. Gapdh and Mapk1 were selected for Group 3 normalization by all software and showed a geNorm Pairwise variation V 2/3 value of 0.163. As Pairwise M cutoff values of 0.20 to 0.15 are suggested [1,32,33], we conclude that there is no need to add a third reference gene (Table 4). geNorm, Normfinder and Best-Keeper showed divergent results for Group 4, as expected because they are based on distinct statistical algorithms. Our conclusion is that Gapdh/Ppia would work well to normalize Group 4 target expression based on Normfinder algorithm. One can argue that Ppia and Rn18s showed the highest coefficients of variance, but it is important to notice that Bestkeeper software does not take into account the stability of a combination of genes. Due to the discrepancies mentioned above and the higher reliability of Normfinder algorithm as stated above, we advise the use of Gapdh/Ppia.
The more complex is the experimental set, harder it becomes to find stably expressed genes. When the same 7 genes were tested for Group 5, which includes all ages tested, Gapdh and Mapk1 were ranked by all three software as the two more stable reference genes for normalization ( Figure 3, Table 1, 2, 3). When geNorm is applied, the use of the four most stable genes is recommended, since the Pairwise variation V 4/5 value was 0.16 ( Figure 4). Mapk1 and Gapdh proved to be reliable internal controls for normalization of Group 5, as low sample variance and the same pattern of expression was obtained when they were used to analyze the expression of cyclin D1 ( Figure 5B), compared to the use of the combination of the four reference genes indicated by geNorm ( Figure 5C). Mapk1 and Gapdh showed robust constitutive expression throughout retinal development. Mapk1 presented a coefficient of variation 30% lower than, for example, the most stable gene investigated in mouse myocardial infarction models [21]. Alternatively, if there is no intention to include the analysis of the embryonic stage, the combination of Gapdh and Ppia would be the most reliable choice.
When searching for reference genes commonly used to compare different stages of retinal development in rodents, it is common to find the use of only one [26,36,37], or at most two reference genes [28,38] for qRT-PCR. The most frequent reference genes used are Actb or Gapdh, but Rn18s, Hprt1 and Prkca are also described [26,28,36,37,39]. Consistent with our data, a single cell study observed an expressive variation on the expression of housekeeping genes among progenitor cells isolated from different stages of retinal development [40].
In conclusion, we demonstrated for the first time how relevant it is to validate a reference gene set suitable for expression studies on rat retinal development. Our results indicate combinations of genes for qRT-PCR analysis for different combinations of developmental stages and MAPK1 as the loading control for Western blot.
We furthermore advise against the use of b-actin for both methods particularly when a long range of developmental stages are analyzed (exemplified by Group 5 in this study, which range from E18 to P45), because this gene is downregulated during retinal development. Given the risk of substantial variation on gene expression among distinct animal models, we encourage the validation of reference genes as an initial and essential step in quantitative studies of either mRNA (qRT-PCR) or protein content (Western blot). This is particularly relevant when comparing tissue extracts from various stages of development, due to the variety of cellular processes modulated throughout morpho-and histogenesis.

Materials
Dulbecco's Modified Eagle Medium (DMEM), UltraPure DNase/RNase-Free water and Trizol were purchased from Invitrogen (Calsbad, CA, USA). First-strand cDNA synthesis kit was purchased from GE Healthcare (Little Chalfont, UK). DNA- Table 2. Bestkeeper ranking for each group of samples with mean 6 SD (standard deviation) and coefficient of variance (CV) of threshold cycle (Ct) values. Group  . Secondary antibodies linked to horseradish peroxidase (HRP) were from Cell Signaling (Beverly, MA, USA). All information about the primary antibodies used is described in Table S2.

Samples
All experimental procedures with animals for this study were approved by the Ethics Committee on Animal Experimentation of the Health Sciences Center of the Universidade Federal do Rio de Table 3. NormFinder analysis with reference genes ranked by their expression stability or different experimental sets.   Embryos were removed from the uterus of pregnant rats euthanized in a carbon dioxide chamber. While the same procedure was performed to kill adult rats, pups and embryos were killed by instantaneous decapitation. Each of the two biological replicates of RNA was extracted from 2-3 pooled individuals. Therefore, the RNA obtained from each biological replicate could be considered an average sample of the developmental stage analyzed [23,41,42,43,44,45]. For western blot analysis at least three independent biological replicates were used.

mRNA extraction and cDNA synthesis
Each pool of retinas of rats at the ages E18, P1, P4, P10, P14 and P45 was washed once with PBS and RNA was extracted with Trizol following manufacturer's instructions. RNA integrity was confirmed by visualization of RNA 18S and 28S in 1% agarose gel electrophoresis. RNA was treated with DNA-free kit following manufacturer's instructions, quantified with NanoDropTM Spectrophotometer ND-1000 (Thermo Scientific) and stored at 280uC. Quantity and quality of RNA extracted were assessed to confirm good RNA yields and purity with a mean A260/A280 ratio of 1.960.2. DNA contamination was ruled out by standard PCR and agarose gel electrophoresis. cDNA was synthesized from 1 mg of RNA with pd(N)6 random primers, as described in kit manual (First Strand cDNA Synthesis Kit, Amersham).

Primer Design, specificity and efficiency
Primers for qRT-PCR were designed with Primer Quest (Integrated DNA Technologies SciTools) with the following criteria: product size ranging from 80 to 285 bp, optimum Tm of 60uC and GC content about 50%. Secondary structures and primer-dimers were avoided. Rn18s primers were from Ambion (Austin, TX, USA). Standard RT-PCR confirmed that the primers amplified only a single product with expected size (data not shown). Primer efficiency was calculated for qRT-PCR using the slope of the calibration curve according to the equation: E = 10 [21/slope] [7]. All information about the primers was included in Table S1.

Quantitative RT-PCR (qRT-PCR)
For qRT-PCR reactions were carried out in an optical 96-well plate in ABI7500 (Applied Biosystems). The primers used for quantitative PCR analysis are listed in Table S1. Control without reverse transcriptase was performed to ensure that the results were not due to amplifications of genomic DNA. PCR product identity was confirmed by melting-point analysis. Each reaction contained 12.5 mL of SYBR Green 26 reaction mix, 2 mL of diluted cDNA (1:65), 100 nM of each primer (0.5 mL each) and 9.5 mL of UltraPure DNase/RNase-Free water (Invitrogen). Conditions used were: 50uC/2 min; 95uC/10 min and 45 amplification cycles of 95uC/15 s; 60uC/60 s. In order to reduce confounding variance, two independent biological samples from different littermates were analyzed in technical triplicates. Technical replicates were averaged before all software analysis.

Reference gene expression stability and statistical analysis
Expression levels of seven putative housekeeping genes in all the samples were determined in the exponential phase by the number of cycles necessary to reach the prior established threshold (Ct) ( Table S1). The Ct values were converted to a linear scale by the equation: 2 2DCT [46]. These data were used as the input to verify the expression stability at geNorm v3.5, BestKeeper and Normfinder tools, following instructions [1,9,25]. geNorm algorithm is based on the geometric averaging of multiple control genes and mean pairwise variation of a gene from all other control genes in a given set of samples. This algorithm also calculates the pairwise variation value, which indicates the optimal number of control genes to be used, and cutoff values of 0.20 [33] to 0.15 [1,32] have been suggested. NormFinder takes into account intra-and intergroup variations for normalization factor (NF) calculations [25]. The expression values of Cyclin D1 were assessed to test the efficacy of the selected endogenous controls. The comparative Ct method (DDCt) was used to determine the target quantity in sample as compared with the mean of different reference genes in combination and relative to a calibrator [(Ct target gene -Ct reference gene ) sample -(Ct target gene -Ct reference gene ) calibrator ]. A similar mathematical correction similar to the one of the software qBase, which is based on the use of the average of DCt of all groups (in this case: E18, P1, P4, P10, P14 and P45), was applied to define the calibrator [47].

Western blots
Dissected retinas from 3 different littermates for each stage were washed with PBS and total protein was extracted (10 mM Tris base-HCl, 150 mM NaCl, 1% NP-40, 1% Triton X-100, 5 mM EDTA, 0.1% SDS, 1% sodium deoxicholate, 1 mM phenylmethylsulfonyl fluoride, 1 mg/ml aprotinin, 1 mg/ml pepstatin, 1 mg/ml leupeptin, 1% sodium orthonovadate and 50 mM sodium fluoride). The concentration of lysates was determined by the Lowry assay [48]. Lysates (30 mg) were separated in SDSpolyacrylamide gels and transferred to nitrocellulose membranes. Membranes were blocked with 5% non-fat dry milk and incubated with primary antibodies to five putative constitutively expressed proteins, listed in Table S2, followed by horseradish peroxidaseconjugated secondary antibodies. Immunoblots were developed with Luminata (Millipore), according to manufacturer's instructions, and densitometric results were analyzed with Image J software. Coefficients of variance were calculated by the ratio between standard deviation and mean. Stripping was performed

Statistical analyses
For Western blot, statistical comparisons between more than two experimental groups were made with one-way anova tests followed by Bonferoni multiple comparisons test. Results are reported as mean 6 standard error of the mean (SEM), and p was set to 0.05. For all analyses, prism 4.0 software (GraphPad Software, San Diego, CA, USA) was used.

Supporting Information
Table S1 Primer sequences for seven putative endogenous control genes and target gene. Detailed description of all genes tested, primer pairs' sequences, PCR conditions and primers efficiencies. (TIF)