Selection and Evaluation of Reference Genes for Expression Analysis Using qRT-PCR in the Beet Armyworm Spodoptera exigua (Hübner) (Lepidoptera: Noctuidae)

Quantitative real-time PCR (qRT-PCR) is a reliable and reproducible technique for measuring and evaluating changes in gene expression. The most common method for analyzing qRT-PCR data is to normalize mRNA levels of target genes to internal reference genes. Evaluating and selecting stable reference genes on a case-by-case basis is critical. The present study aimed to facilitate gene expression studies by identifying the most suitable reference genes for normalization of mRNA expression in qRT-PCR analysis of the beet armyworm Spodoptera exigua (Lepidoptera: Noctuidae). For this purpose, three software tools (geNorm, NormFinder and BestKeeper) were used to investigate 10 candidate reference genes in nine developmental stages and five different tissues (epidermis, head, midgut, fat body and hemolymph) in three larval physiological stages (molting, feeding and wandering stages) of, S. exigua. With the exception of 18S ribosomal RNA (18S), all other candidate genes evaluated, β-actin1(ACT1), β-actin2 (ACT2), elongation factor1(EF1), elongation factor 2 (EF2), Glyceralde hyde-3-phosphate dehydrogenase (GAPDH), ribosomal protein L10 (L10), ribosomal protein L17A (L17A), superoxide dismutase (SOD), α-tubulin (TUB),proved to be acceptable reference genes. However, their suitability partly differed between physiological stages and different tissues. L10, EF2 and L17A ranked highest in all tissue sample sets. SOD, ACT2, GAPDH, EF1 and ACT1 were stably expressed in all developmental stage sample sets; ACT2, ACT1 and L10 for larvae sample sets; GAPDH, ACT1 and ACT2 for pupae and adults; SOD and L17A for males; and EF2 and SOD for females. The expression stability of genes varied in different conditions. The findings provided here demonstrated, with a few exceptions, the suitability of most of the 10 reference genes tested in tissues and life developmental stages. Overall, this study emphasizes the importance of validating reference genes for qRT-PCR analysis in S. exigua.


Introduction
Quantification of gene expression levels is fundamentally important for identifying genes relevant to biological processes [1] and provides insights into complex regulatory networks. Quantitative real-time PCR (qRT-PCR) [2,3] is one of the most reliable and reproducible techniques available to measure and evaluate changes in gene expression [4], which is often used to confirm or refute interpretations of relative gene expression profiles derived from high-throughput systems [4,5]. The qRT-PCR technique is sensitive enough to detect subtle alterations in gene expression, even for those with fairly low transcript levels [6,7]. Although this powerful technique is often described as the gold standard, results are inevitably affected by different experimental conditions, such as different amounts of starting material, quality and integrity of template RNA samples, reverse transcription efficiency, recovery and integrity of mRNA, primer design and transcription efficiency [8]. Additionally, random pipetting errors can add technical variability to the data [9,10]. As these factors can potentially render the quantification of gene transcripts unreliable, having a robust system for normalization of qRT-PCR data is essential to avoid non-specific variations or errors [6,11]. The most common method for normalizing gene expression levels is to compare mRNA levels of the genes of interest to those of endogenous control genes, which are often called housekeeping or reference genes.
Ideal reference genes should not be regulated or influenced by the experimental procedure or co-regulated with the target gene. They should also be expressed in abundance and have minimal innate variability [12]. However, the indiscriminate use of some internal reference genes is questionable, since their expression levels are regulated according to cellular conditions [13][14][15]. Several studies have shown that this approach can introduce large errors when the expression of such ''housekeeping genes'' varies under different treatments and in different tissues [16,17].
The beet armyworm, Spodoptera exigua (Lepidoptera: Noctuidae), is a widespread and polyphagous lepidopteran pest that causes severe economic damage in both dicotyledon (e.g., sugar beet, alfalfa, cotton, chrysanthemum) and monocotyledon (e.g., rice) crops and flower species. Molecular studies have been widely conducted previously in S. exigua [18][19][20], including investigations of insecticide resistance [21][22][23] and the role of important genes in physiological processes of the insect [24]. Understanding the function of important regulatory genes at the molecular level is essential for pest control. The molting, feeding and wandering stages are three larval physiological stages of the Lepidoptera larvae, which are regulated by different specific hormone levels. Tissues and genes in these three larval physiological stages have shown significant differences [25][26][27]. Therefore, molecular studies directed towards the three larval physiological stages are at the center of Lepidoptera physiological research. The fat body is a major tissue found to play an important role in the metabolism and detoxification of xenobiotics in insects [28,29]. Receptors involved in insecticide resistance have been found in the midgut [30,31], and several antiviral proteins and genes of potential value in clinical medicine were found in the epidermis and hemolymph [32]. Exploring gene expression profiles in these tissues will help our understanding of the regulation of the three larval physiological stages and facilitate application of useful resources to control the pest. Several genes have been demonstrated to be differentially expressed in some tissues based on sex [33], and some genes related to insecticide resistance, such as P450, are regulated by female mating [34]. Some studies have also reported differences in expression of the sex pheromones of S. exigua [35].
To date, studies have been published on evaluating the stability of reference genes in some insects [36,37,38]. Teng et al. [39] chose 4 candidates and screened the relatively stable reference genes in 4 lepidopteran insect species including S. exigua. Several studies have shown that each candidate reference gene should be evaluated under specific experimental conditions for gene expression profiling to ensure that expression occurs at a constant level [40]. The evaluation and selection from just four candidate genes seemed to be insufficient. Therefore, ten commonly used reference genes b-actin1(ACT1), b-actin2 (ACT2), elongation factor1(EF1), elongation factor 2 (EF2), Glyceralde hyde-3-phosphate dehydrogenase (GAPDH), ribosomal protein L10 (L10), ribosomal protein L17A (L17A), superoxide dismutase (SOD), a-tubulin (TUB), 18S ribosomal RNA (18S) from S. exigua were tested and their effectiveness for the normalisation of expression studies were further validated by quantitative analysis of a well-studied target diapause-specific peptide (DSP) gene. Three available and commonly used tools (geNorm, NormFinder and BestKeeper) were used to determine a set of the most stably expressed genes in different developmental stages (egg, 1 st larvae, 2 nd larvae, 3 rd larvae, 4 th larvae, 5 th larvae, prepupae, pupae and adult), in both sexes of pupae and adults, as well as in five different tissues (epidermis, head, midgut, fat body and hemolymph) and three larval physiological stages (molting stage, feeding stage and wandering stage) of S. exigua. The objectives of this investigation were (i) to provide appropriate reference genes to develop an accurate and comprehensive qRT-PCR method for use in S. exigua studies, and (ii) to assess the importance of variations in relative quantification among normalization strategies in different developmental stages, sexes, larval physiological stages and tissues, with a focus on the merits of using multiple versus single reference genes in different studies.

Insects
S. exigua were reared on an artificial diet [41] at 2761uC (14L: 10D). Pupae were selected and sexed on the third day. Adult males and females were allowed to emerge in transparent containers and fed with a 5% honey solution.

Sample collection
The stability of candidate genes was tested in different S. exigua samples of (i) five different tissues in three larval physiological stages, (ii) different developmental stages and (iii) two sexes. Only the tissue samples in three larval physiological stages had been dissected individually and all other samples were whole body. For each of the different sample groups, three replicate cages were used.

Sampling of different tissues in three larval physiological stages
For this study, the beet armyworms were synchronized in the 4 th larval molting stage, 5 th larval feeding stage (48 h post-molting) or 5 th larval wandering stage (96 h post-molting), and then larvae in the three larval physiological stages were dissected individually using a dissection needle in physiological saline. The epidermis, head, midgut, fat body and hemolymph were collected separately. The collected tissues were quickly frozen and homogenized immediately after dissection with liquid nitrogen in a mortar and used for RNA extraction.

Samples of different sexes
Male pupae (10 per pool), female pupae (10 per pool), male adults (10 per pool) and female adults (10 per pool), were collected separately and placed in 1.5 ml centrifuge tubes.

Selection of gene sequences and primer design
PCR primer sequences used for quantification of the 10 candidate genes are shown in Table 1. The secondary structure of the DNA template was analyzed with UNAFold [42] using the mfold web server (http://mfold.rna.albany.edu/?q = mfold/DNA-Folding-Form) [43] with the following settings: melting temperature, 60uC; DNA sequence, linear, Na + concentration, 50 mM; Mg ++ concentration, 3 mM. Other parameters were set by default. The primers were designed using NCBI Primer-BLAST (http:// www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC = BlastHome), with the settings: primer melting temperature, 60uC; primer GC content, 40-60%; and PCR product size, 80-200 base pairs. The excluded regions were based on results of analysis by mfold, and other parameters were set by default.

Total RNA isolation and cDNA synthesis
All collected samples were preserved in microcentrifuge tubes (1.5 ml) and stored at 280uC after being frozen in liquid nitrogen. Subsequently, three total RNA samples were prepared for each sample set using the SV Total RNA Isolation System (Promega, USA). According to the protocol of the kit, total RNA was incubated for 15 minutes at 20-25uC after adding 5 ml DNase I enzyme (Promega, USA). The purified RNA was stored at 280uC before further processing. The quality and quantity of RNA were assessed with a UV-1800 spectrophotometer (Shimadzu, Japan). cDNA was produced using the PrimeScript 1st Strand cDNA Synthesis Kit (TAKARA, Japan) in a total volume of 20 ml, with 4 ml 56PrimeScript Buffer,1 mg of total RNA, 1 ml oligo dT primer, 1 ml PrimeScript RTase (200 U/ml), and 0.5 ml RNase Inhibitor (40 U/ml). Following the manufacturer's protocol, the 20 ul mixture was incubated for 60 minutes at 42uC. No-template and no-reverse transcription (no-RT) controls were run for each reverse transcription run for the control treatment. cDNA was stored at220uC until used.

qRT-PCR
Triplicate first strand DNA aliquots for each sample served as templates for qRT-PCR using SoFast TM EvaGreenH Supermix (Bio-Rad, USA) on an iQ2 Optical System (Bio-Rad). Each amplification reaction was performed in a 20 ml total volume with 1 ml of cDNA and 100 nM of each primer in an iQ TM 96-well PCR plate (Bio-Rad), which was covered with Microseal ''B'' adhesive seals (Bio-Rad). Thermal cycling conditions included initial denaturation at 95uC for 30 s, followed by 40 cycles of 95uC for 5 s and 60uC for 10 s. After all reactions, a melting curve analysis from 65 to 95uC was applied to ensure consistency and specificity of the amplified product. A 10-fold dilution series of cDNA from the whole body of adults was employed as a standard curve, and the qRT-PCR efficiency was determined for each gene and each treatment with the slope of a linear regression model [44]. The corresponding qRT-PCR efficiencies (E) were calculated according to the equation: E = (10 [21/slope] 21)6100 [45].

Stability of gene expression
The stability of candidate genes was evaluated by three commonly used software tools, BestKeeper [46,47], geNorm (http://medgen.ugent.be/˜jvdesomp/genorm/) [48] and Norm-Finder (http://www.mdl.dk/publicationsnormfinder.htm) [49]. The Excel based tool Bestkeeper, is able to compare expression levels of up to ten housekeeping genes together with ten target genes, each up to hundred biological samples. The raw data of cycle threshold (Ct) values(CP values) and PCR efficiency (E) of the candidate genes were used to determine the best-suited standards by BestKeeper. The underlying principle for identification of stably expressed reference genes by Bestkeeper is that the expression levels of suitable reference genes should be highly correlated. Therefore, the correlation between each candidate and the index is calculated, describing the relation between the index and the contributing candidate reference gene by the coefficient of determination and the P value [46].Ct values converted to linear values (the lowest relative quantity for each gene was set to 1) were used as input data for subsequent analyses with geNorm and NormFinder. Similar with Bestkeeper, the key principle of geNorm is that the expression ratio of two suitable reference genes should be constant across samples. geNorm algorithm first calculates an expression stability value (M) for each gene and then compares the pairwise variation (V) of this gene with the others. Using microarray data as a training set for the algorithm, the value of Vn/Vn+1 indicates the pairwise variation between two sequential normalization factors and determines the optimal number of reference genes required for accurate normalization. A value below 0.15 indicates that an additional reference gene will not significantly improve normalization. Reference genes are ranked according to their expression stability by a repeated process of stepwise exclusion of the least stably expressed genes [48]. NormFinder provides a stability value for each gene which is a direct measure for the estimated expression variation enabling the user to evaluate the systematic error introduced when using the gene for normalizsation [49]. Every gene was ranked by the three software tools and assigned an appropriate weight separately. The final ranking was established after calculating the geometric mean of their weights.

Evaluation of target gene expression
DSP of S. exigua was used as a target gene to evaluate the candidate reference genes. Normalized with different reference genes, relative quantification of DSP in different samples were conducted according to threshold cycle (Ct) value based on 2 2ggCt method.

Amplification efficiencies
The initial screening of 10 candidate reference genes and one target gene by PCR showed that all of the genes were expressed in all S. exigua sample sets, as indicated by the presence of a single amplicon of the expected size on a 2% agarose gel. In order to determine the amplification efficiency of all 11 genes in the study, 5-point standard curves with known concentrations of transcribed reference RNA were made. All amplification efficiencies in the qRT-PCR analysis for the 10 candidate genes and one target gene ranged between 98.5,111.4% compared with the templates from which the primers were designed. Linear regression coefficients (r 2 ) for all 11 genes were $0.990 (Table 1).

Expression levels of 10 candidate reference genes
Relative Ct values are widely used as a simple way to identify stably expressed genes by qRT-PCR. Gene expression analyses of the 10 candidate genes exhibited a narrow mean Ct value range across all the experimental samples ( Figure 1). The Ct values of the candidate reference genes under the same threshold value for fluorescence ranged from 9.18 for 18S to 26.00 for ACT2, which were the most and least abundant transcripts, respectively. There were no much differences among the average Ct values for each gene, and the range of values was consistently narrower in individuals than in tissue samples, when the two sample sets consisting of the developmental stage samples and the tissue ones in three larval physiological stages were compared ( Figure 1B and 1C). The amplification of 18S, which was generally highly expressed, produced much lower Ct values (mean Ct = 12.41) than did other genes overall.  Figure 1A). The Ct values obtained for the target gene DSP varied in different samples, ranging from 18.23 (female pupae) to 27.51 (midgut of 5 th feeding larval stage). Therefore, the standard errors of Ct values obtained for DSP were larger than those of all of the candidate reference genes studied ( Figure 1).

Expression stability of candidate reference genes
Ct values of the 10 candidate reference genes were obtained in each sample, and variations in their expression were assessed by Bestkeeper. Ct values converted to linear values were used as input data for subsequent analyses with geNorm and NormFinder.

BestKeeper analysis
The high correlation of expression levels is the key principle for identification of stable reference genes, which ideally should display similar expression patterns across samples. The program BestKeeper was used to determine variations in expression and standard deviations (SD) of 10 candidate reference genes and a target gene. Examination of the standard deviations (SD) ( Table 2) revealed that the candidate reference genes were not all stable across different samples, because some showed SD values higher than 1.0. The variations were diverse in different sample groups.
Due to high variability as presented with SD (6 CP) .1.0, the following genes in the indicated samples were excluded: EF1, GAPDH, and SOD in all tissues samples; EF1 and 18S in molting stage samples; EF1, GAPDH, SOD, and 18S in feeding stage samples; TUB in wandering stage samples; SOD in fat body and hemolymph samples; GAPDH and SOD in head samples; ACT1, ACT2, EF1, and GAPDH in midgut samples; and ACT1, ACT2, and TUB in female samples. The target gene DSP showed the highest variations with SD values of nearly 1.0. Other genes in each experimental condition were ranked based on Pearson's correlation coefficient (a higher coefficient indicates greater stability of expression) ( Table 2). Interestingly, despite displaying acceptable stability levels (i.e., far below the default limit of SD 1.0), SOD had the highest standard deviation, indicating that it was the least stable of the candidate reference genes. Since the expression of 18S was exceptionally high and variable across the different treatments ( Figure 1 and Table 2), it was excluded from further analyses.

geNorm analysis
Next, the geNorm software was used to determine the expression stability of the selected candidate genes in different samples. The expression ratio of two suitable reference genes should be constant across different samples, which is the underlying principle followed by the geNorm program. Two parameters defined by the program were used to assess the stability of reference genes: M (average expression stability) and V (pairwise variation). In each group of samples, the M stability value for each gene, which is inversely related to expression stability, was obtained as the average pair-wise variation in the transcript levels of one gene with respect to all other reference genes tested. V values were determined with all other control genes as the SD of the logarithmically transformed expression ratios.
The gene with the highest M value was considered to have the least stable expression. Thus, the tested reference genes were ranked according to the stability of their expression by stepwise exclusion of the gene with the highest M value (Figure 2). Starting from the two most stable genes on the right, the genes are ranked according to reducing expression stability, ending with the least stable gene on the left. From all of the expression data of the tissue sample groups examined, EF2, L10, and L17A were the three most stable genes, suggesting that they play housekeeping roles and may be widely used for multiple conditions (Figure 2). While the ACT1 and ACT2 genes with an M value of 0.3990 were most stably expressed throughout the developmental stages ( Figure 2J), EF2 and L10 showed the higher stable expression in pupae ( Figure 2L). The highest ranked genes were ACT1 and ACT2 for larvae and male groups ( Figure 2K, 2N); ACT1 and GAPDH for adults ( Figure 2M); and ACT2 and TUB for female groups ( Figure 2O). All tested reference genes reached high expression stability with M values below 1.1, far below the default limit of 1.5 for defining stably expressed genes.
It has been reported that more than one reference gene is required for accurate normalization [16]. When the use of additional genes was equally informative, the pairwise variation (Vn/Vn+1) between the sequential normalization factors (NF) (NFn and NFn+1) was calculated by geNorm to determine the optimal number of reference genes for each experimental condition (Figure 3). The cut-off value of pairwise variation of 0.15 was proposed to indicate that inclusion of an additional reference gene would be unnecessary.
Analysis of the pairwise variation in all developmental stages samples revealed a significant decrease with the inclusion of a fifth gene (Figure 3). Normalization factors should preferably consist of at least five reference genes, because the pairwise variation of the V2/3, V3/4, and V4/5 values were 0.158, 0.175, and 0.159, respectively, all of which exceeded the threshold of 0.15, while the pairwise variation of the V5/6 value was 0.1118. Based on this analysis, EF1, SOD, TUB, ACT1 and ACT2 should be ideal reference genes for normalizing gene expression data in all developmental stages samples. Analysis of the pairwise variation in other samples revealed that two reference genes may be sufficient to normalize expression values of target genes. Therefore, the two most stably expressed genes, mentioned above for each type of samples were selected as reference genes.

NormFinder analysis
Finally, the NormFinder software tool was also employed to investigate each type of sample. This algorithm is used for identifying the optimal normalization gene among a set of candidate genes. When analyzing expression data using the qRT-PCR method, the software provides a stability value for each gene, which is the estimated expression variation if such gene is used for normalization. The candidate normalization genes were ranked according to the stability of their expression patterns between subgroups of the sample set under a given experimental condition. The lower average expression stability values represented more stable gene expression within the gene set examined.
Similarly to geNorm, the top-ranked candidates in different sample groups were analyzed by NormFinder (Table 3). Among all tissues, L10 ranked one of the three most stable genes, while EF2 and L17A ranked between the top four genes (except in hemolymph and midgut samples), in agreement with results of the other two programs. The ranking in the two sex sample groups showed significant differences compared with other body samples. For example, the L10 gene ranked among the four most stable genes for pupae, adult and all larvae sample groups, but it ranked last in the sex sample groups. The SOD gene also ranked better in male and female samples than in others samples. All ranking results are summarized in Table S1.

Consensus list of reference genes
Because of the different algorithms used and the different sensitivities toward co-regulated reference gene candidates, the three software tools offered different ranks in each sample group. Although rankings of the most suitable reference genes were not identical, the three best reference genes identified by the different methods were similar, and they only varied in their relative rank positions (Table S1). Finally, the highest ranking reference genes were identified (Table 4). Interestingly, L10, EF2 and L17A ranked highest in different tissue groups, and ACT2 ranked as the fourth most stable (except in molting stage samples), indicating that these four genes could be selected as the best reference genes for tissue research in S. exigua.
In different whole body samples, the last ranking reference genes showed significant differences. The best reference genes were ACT2, ACT1 and L10 for larvae groups; GAPDH, ACT1 and ACT2 for pupae and adults; SOD and L17A for males; and EF2 and SOD for females (Table 4). Thus, for all developmental stage samples, using the five genes SOD, ACT2, GAPDH, EF1 and ACT1 together should provide reliable results in expression studies of S. exigua.

Target gene expression
In order to demonstrate the effect of reference genes on target gene expression data, the relative expression of the target gene DSP was investigated under different experimental conditions. Target expression analyses further showed that differences in quantification were detected when normalizing with arbitrary internal controls relative to the best reference genes. The best or the most unstable reference genes were selected based on their rank order of expression stability among the 10 candidates evaluated in this study.
Arbitrary selection of reference genes may thus decrease the accuracy of calculating target gene expression, since such a normalization strategy can over-estimate or under-estimate differences in expression level among different samples. For example, the relative expression level of DSP showed no significant differences between adult male and female samples when calculated using 18S as the reference gene; however, its expression was significantly different when normalized by other reference genes (such as ACT1, ACT2) ( Figure 4J). Similar changes also occurred in calculating relative expression levels of DSP after normalization by other unstable reference genes, such as GAPDH in the molting stage, TUB in the molting stage, SOD and 18S in the epidermis, SOD in the head, EF1 and SOD in hemolymph, TUB in all larvae groups and EF1 and SOD in pupae ( Figure 4A, 2C, 2D, 2F, 2G, 2H and 2I).
Relative expression levels of the target gene in different samples were even more divergent if calculated using arbitrary reference genes. For example, when using EF1 or SOD as reference gene, the fat body DSP expression in the feeding stage was lower than that in the wandering stage; however, after using other reference genes, the conclusion was modified, and they were determined to be at the same level ( Figure 4E). Similar errors could be produced when using EF1 in the molting stage, and ACT2 or SOD in the feeding stage ( Figure 4A, B). Thus, determination of the optimal reference genes is important for accurate normalization of qRT-PCR data, especially when differences in expression levels are subtle.
Inaccurate conclusions were made when certain reference genes were used for normalization. While the larvae DSP expression level was higher in 3 rd larvae when using GAPDH as the reference gene, it exhibited a significant age-dependent decrease in larvae when normalizing with ACT1 or ACT2 ( Figure 4H). Similar results were found when using 18S in the molting stage and GAPDH in the head tissue (Figure4C, 4F). Taken together, results of this study showed that the selection of reference genes for qRT-PCR data normalization varied on a case-by-case basis. Thus, in order to obtain accurate expression data, any given sample set must be assessed using the panel of selected candidate reference genes.  Discussion qRT-PCR quantification requires robust normalization by reference genes to offset confounding variations in experimental data. However, improper selection of reference genes can conceal or magnify real biological changes due to changes in the reference gene expression [50,51]. Furthermore, using a single endogenous control can also profoundly influence the statistical outcome and may lead to inaccurate data interpretation [52]. Each candidate reference gene should be evaluated under specific experimental conditions for gene profiling to ensure a constant level of expression [40]. In this study, we examined 10 candidate reference genes from S. exigua and analyzed the stability of these genes across various sample sets using three analytical software programs. These genes involved in ubiquitous cellular processes represent those commonly used as single normalizer in S. exigua gene expression studies. The studied sample subgroups included detailed life developmental stages (prepupae and every instar of the larvae), both sexes and five different tissues in three larval physiological stages. Although different ranks were offered by the three analytical tools, the combined results ultimately provided recommendations for the optimal reference genes. The different rankings of the reference genes in different sample sets in this study illustrated the need for evaluating their use under various experimental conditions. Compared with a previous report [39], our work provides a more complete set of information for the selection of reference genes in S. exigua.
A major conclusion arising from our results is that none of the candidate reference genes could serve as a ''universal'' normalizer that would maintain a constant expression level across all experimental conditions. Although most of these candidates are considered to be classical housekeeping genes and are widely used for data normalization, they exhibited considerable variations in expression stability among the different samples. The results of this study emphasized that the stability of reference gene expression must be verified under all experimental conditions to be investigated. Given that all internal reference genes are regulated to some extent and if none are constitutively expressed for each experimental treatment, then a combination of reference genes that would best fulfill the universality criteria should be selected (i.e., L10, EF2, and L17A across different tissue subgroups in our study).
In contrast to the findings of Teng et al [39], who selected relatively stable reference genes just from four candidate housekeeping genes in certain tissues from final instar larvae and developmental stages, we investigated the influence of many more variables (e.g. five different tissues in three larval physiological stages and life developmental stages) on the expression stability of the studied genes. The authors of the study above chose GAPDH as one of the most stable reference genes, while GAPDH ranked last among the reference genes across most sample sets in the current study (Table S1). Assessing a low number of initial candidate genes would lead to such a misleading result. Although the GAPDH gene ranked in the first three in all developmental stages sample subgroup, normalizing the expression of a target gene just by one reference gene is not ideal. Since the GAPDH gene ranked highly in some sample subgroups, it may be used as the reference gene in certain experimental conditions but not as a sole universal normalizer.
While stability of reference genes still must be determined on a case-by-case basis in S. exigua studies, certain genes may be preferred for normalization in experiments involving different treatments. The lowest ranking reference genes also showed significant differences across different whole body sample subgroups. This observation indicated that candidate genes in body subgroups showed more variations than in tissue subgroups.
Results of this study suggest that more complex sample sets will exhibit higher variability in the reference genes. To determine the optimal number of reference genes, the pairwise variation (Vn/ Vn+1) between the sequential NF (NFn and NFn+1) was calculated by geNorm. After the analysis, two reference genes were found to be sufficient for normalizing expression values of target genes in most of the samples, but five reference genes were needed in all of the life developmental stages samples (Figure 3), indicating that larger sample sizes require a higher number of reference genes for accurate normalization. The same results were obtained in a Drosophila reference gene selection study [53]. Along the same way, in an aging-related study, nine reference genes were sufficient for three samples, whereas 13 reference genes were needed when nine samples were tested. Finally in a neurodegeneration-related study, one reference gene was feasible for three samples; however, the number of the reference genes needed to be increased to six for accurate normalization in nine samples. Perhaps additional reference genes are required when adding more samples into a study, because it would be harder to reach the minimum value of Vn/n+1, due to the introduction of more unstable factors. Compared to the other reference genes tested, 18S ranked low in most experimental conditions and displayed an excessively high expression level, excluding it as a potential reference gene. However, the differences in rRNA and mRNA fractions between samples limit the use of 18S as a normalizer in qRT-PCR analyses [1,40,54]. In other words, rRNA cannot be used for correcting sample-to-sample variation in the quantity of mRNA, as it has been shown on occasion to fail to be representative of mRNA levels [55]. This shortcoming may explain the high coefficient of variation of 18S in our study. The low Ct values of 18S observed in our study (Figure 1) reflect the abundance of these transcripts. Thus, in order to use 18S as a reference gene, the samples would need to be diluted sufficiently to keep the rRNA within the range of detection. However, the target gene would also be diluted further, potentially leading to an over-estimation or underestimation in differences of target gene expression among different conditions ( Figure 4). Therefore, we suggest that18S should be excluded as a reference gene in qRT-PCR, since the expression of other candidates proved to be more stable.
SOD was never listed in the top three ranked reference genes across tissue sample sets and some of the whole body sample sets in our study (Table 4 & Table S1). Similarly, the glycolytic enzyme GAPDH was selected as a suitable reference gene for only three out of fifteen samples (i.e., pupae and adult), even though it has been reported as a good normalizer in previous gene expression studies of S. exigua and other insect species [38]. In contrast, L10 and L17A, were found to be stably expressed across the different samples. Though they all encode the structural constituents of the ribosome, it was reported that L17A was related to the pupal diapause regulation in the insect [56] and L10 was a component of an antiviral signaling [57]. These observations indicate that the two ribosomal protein genes are not co-regulated and can be regarded as independent reference genes. Additionally, EF2 ranked at the top as a reference gene in most samples in this study, but it has rarely been previously used as a normalizer. Interestingly, EF1 ranked last in some sample sets in our study. The elongation factors (i.e., EF1 & EF2) play an important role in translation by catalyzing GTP-dependent binding of aminoacyl tRNA to the acceptor site of the ribosome. Recently, a number of studies have reported that it is a suitable reference gene in different species, including salmon [58,59], humans [60] and Orthoptera [61,62].
Actin, as the major component of the protein scaffold which supports the cell and determines its shape, was also selected as a good reference gene under some conditions. It was expressed at moderately abundant levels in most samples. Even though two actin genes were selected as candidate reference genes in our study, they have been reported to be unsuitable for normalizing qRT-PCR data due to large measurement errors [4]. On the other hand, actin has ranked at the top as a reference gene in expression studies in the desert locust [61], European honey bee [38], two species of Collembola [63] and the salmon louse [58].
Taken together, the simultaneous measurement of a panel of candidate reference genes is essential for quantification by qRT-PCR. As empirically-determined or pre-validated reference genes may yield inaccurate results, data normalization needs to be optimized for each particular assay.

Conclusion
In our study, several reference genes suitable for normalizing qRT-PCR data in S. exigua were identified. Although most of the selected candidates exhibited stable expression patterns acceptable for reference genes, some showed the highest stability in different experimental conditions. While the expression levels of L10, EF2, and L17A were most stable in different tissue sample sets, the best reference genes selected were ACT2, ACT1, and L10 for larvae samples; GAPDH, ACT1, and ACT2 for pupae and adults; SOD and L17A for males; and EF2 and SOD for females. Overall, five genes, SOD, ACT2, GAPDH, EF1, and ACT1, were determined to be most reliable when used together to analyze all developmental stage sample groups in S. exigua.

Supporting Information
Table S1 Ranking of candidate reference genes according to their stability value using BestKeeper, geNorm, and NormFinder analyses. Candidates are listed from top to bottom in order of decreasing expression stability. (DOC)