Identification and Evaluation of Reference Genes for Quantitative PCR Normalization in Alligator Weed Flea Beetle (Coleoptera: Chrysomelidae)

Abstract Stably expressed reference genes are critical internal standards for the quantification of gene transcription levels using quantitative real-time PCR. Housekeeping genes are commonly used as reference genes but their expressions were variable depending on experimental conditions in many insect species studied. Here we report the identification and evaluation of 10 housekeeping genes in alligator weed flea beetle, Agasicles hygrophila Selman & Vogt (Coleoptera: Chrysomelidae), a biocontrol agent of alligator weed. The 10 housekeeping genes are: beta-actin (Actin), ribosomal protein L13A (PRL13a), succinate dehydrogenase complex subunit A (SDHA), ribosomal protein S20 (RPS20), ribosomal protein S13 (RPS13), glyceraldehyde phosphate dehydrogenase (GAPDH), TATA-box-binding protein (TBP), ribosomal protein L32 (RPL32), tubulin alpha-1 chain (TUBULIN), and elongation factor-1 alpha (ELF). Five programs, geNorm, NormFinder, BestKeeper, ΔCt method, and RefFinder, were used to evaluate the expression stability of the 10 genes among various A. hygrophila body parts and with different nutrient types (starvation, diet types). The expression stability analysis showed that RPS32 and RPL13a were reliable reference genes for the study of gene transcription in different body parts; Actin and RPL13a were optimal reference genes for different nutrient types. The selections of reference genes were validated using a CarE gene (GeneBank No: KX353552). The results of this study provide useful bases for studies of gene expression in various aspects relating to A. hygrophila.

Alligator weed flea beetle, Agasicles hygrophila Selman & Vogt, is a well-established biological agent for the control of alligator weed [Alternanthera philoxeroides (Mart.) Griseb (Amaranthaceae)] (Spencer and Coulson, 1976;Ma and Wang, 2005). As a successful example of biological control agent, A. hygrophila has been studied regarding biology, physiology, ecology, and application security (Maddox 1968, Stewart et al. 1999, Guo et al. 2012. A. hygrophila is a monophagous insect feeding exclusively on A. philoxeroides. This insect-plant relationship remains unbreakable in any geographical environment. For example, A. hygrophila was introduced into China 30 yr ago and its host shift has not been observed so far (Ye et al. 2003, Lu et al. 2010. Therefore, this unique interaction between A. hygrophila and A. philoxeroides has lent itself a model system for studying of herbivore-plant specificity. However, there is little understanding of the underlying mechanism of this strong herbivore-plant specialization. In recent years, studies have been conducted primarily with A. philoxeroides, in the aspects of genetic variation (Ma et al. 2013), gene expression in response to abiotic stresses (Jia et al. 2014(Jia et al. , 2020aGuo et al. 2017), as well as olfactory cues in herbivore host shifting (Li et al. 2017). Research of A. hygrophila at the molecular level is just emerging (Zhang et al. 2018(Zhang et al. , 2019Jia et al. 2020a, b).
Quantitative real-time PCR (RT-qPCR) is the most common tool for gene transcription analysis because of its sensitivity and repeatability (Bustin et al. 2005, Nolan et al. 2006). However, RT-qPCR outcomes are highly dependent on the quality and integrity of the RNAs, the quality and quantity of the template cDNAs, primer specificity, and amplification efficiency. To normalize these variations, internal reference genes are critical for the accurate quantification of the target gene expression. An ideal reference gene should be expressed ubiquitously and insensitive under various experimental conditions.
Most reference genes of RT-qPCR are housekeeping genes because their expressions are ubiquitous and stable regardless of environmental conditions (Pan et al. 2015, Sun et al. 2015. Examples of such genes are NADH oxidase (NADH; Li et al. 2013), beta-actin (Actin; Zhai et al. 2013), glyceraldehyde-3-phosphate dehydrogenase (GAPDH; Zhu et al. 2014), and 18S ribosomal RNA (18S rRNA; Nicot et al. 2005). However, the expression of these popular internal reference genes sometimes varies significantly depending on sample types or experimental conditions (Fu et al. 2013, Li et al. 2013. Several studies have indicated that the selection of internal control genes is critical for gene transcription analysis (Arun et al. 2015, Sun et al. 2015 as normalization with unsuitable internal control genes can lead to false results. In fact, fewer genes are stably expressed and suitable for gene expression analysis for all cell and tissue types, or in various experimental conditions (Yang et al. 2014a. Therefore, the expression profiles of housekeeping genes under various conditions for a given insect species require meticulous evaluation. The stability of housekeeping genes in A. hygrophila has not been explored. To identity suitable reference genes in A. hygrophila, 10 candidate reference genes were selected from our in-house transcriptome database. The expression stability of these genes was evaluated in major body parts and nutrient types (starvation, fed with host or non-host plants). Five algorithm-based methods, NormFinder (Andersen et al. 2004), geNorm (Vandesompele et al. 2002), BestKeeper (Pfaffl et al. 2004), the ΔCt method (Nicholas et al. 2006), and RefFinder (Faten et al. 2014) were used to evaluate and rank the stability of these 10 candidate genes for their suitability as reference genes. The results provide much needed guidance for selecting reliable reference genes in gene expression studies in A. hygrophila.

Insects
An A. hygrophila colony was established in 2007 from adults collected from A. philoxeroides grown at the campus of South China Agricultural University (Wushan, Guangzhou, Guangdong). Since then, the insects had been maintained in a growth chamber (PRX-450C) under the conditions of 26°C, of 85 ± 5% RH, and a photoperiod of 12:12 (L:D) h at the College of Plant Protection, Shanxi Agricultural University. Adult insects (in groups of 10-15 individuals, mixed sexes) were placed in glass jars (7 cm in diameter and 8 cm in height with moist filter paper at the bottom) containing fresh alligator weed plants. The jars were covered with fine muslin cloth fastened with rubber bands. Females laid eggs in clusters on the abaxial surface of leaves. Leaves with eggs were collected and placed in petri dishes (15 cm in diameter) with moist filter paper at the bottom and fresh alligator weed shoots as food source for hatched larvae. The dishes were covered with perforated plastic wrap fastened with rubber bands. When the larvae reached at the third instar, they were transferred to glass jars with fresh alligator weed stems until adults. The alligator weed plants (shoots or stems) were replaced daily.

Sample Preparation
The samples of A. hygrophila body parts: head, midgut, and residue part (body without head and midgut), were collected from 30 adults of mixed sexes (second day after emergence). Four independent biological replicates were prepared. All the samples were flash-frozen in liquid nitrogen and kept at −80°C until RNA extraction.
For nutrient type, 20 or 30 newly emerged adults (<12 h old, male: female = 1:1) of A. hygrophila were used as one replication. With the starvation treatment, the 20 adults were placed in a glass jar with only moist filter paper at the bottom; while for the host plant treatment, 20 adults were placed in a glass jar supplied with alligator weed leaves. For the non-host treatment, individual insect was supplied with the leaves of Beta vulgaris var. cicla. Thirty adults were prepared to ensure that at least 20 adults had consumed the supplied leaves and could be used for further analysis. All plant materials were replaced twice daily and kept moist with damp filter papers at the bottom of the jar. There were four biological replications (20 beetles each) for each nutrient treatment. After 48 h, the beetles (in groups of 20) were collected for RNA extraction and further analysis.

Total RNA Extraction and cDNA Synthesis
Total RNA was extracted from each insect sample using Trizol reagent (Invitrogen, Carlsbad, CA). To remove potential genomic DNA contamination, the extracts were treated with RNase-free DNase I following the manufacturer's instructions, and then purified using RNeasy spin columns (Qiagen, Valencia, CA). The RNA was quantified using the NanoVue UV-Vis spectrophotometer (GE Healthcare Bio-Science, Uppsala, Sweden) and examined for its integrity by 1% agarose gel electrophoresis. The first-strand cDNAs were synthesized from 4 µg of total RNA of each sample with an oligo (dT) 18 primer and M-MLV reverse transcriptase (Fermentas, New York, NY).

Reverse Transcription Quantitative PCR (RT-qPCR)
RT-qPCR experiments were carried out using the Biosystems 7500 real-time PCR system (Applied Biosystems Inc, Foster, CA) with SYBR Premix Ex Taq TM II kit (Takara, Dalian, China). The cycle parameters consisted of an initial step at 95°C for 10 s, followed by 40 cycles of 95°C for 5 s, 60°C for 31 s, and a final melting curve analysis. The reaction volume was 20 µl containing 10 µl SYBR GREEN Real-time PCR Master mix (TOYOBO), 0.8 µl of each primer (10 mM), 2 µl template (20× diluted), and sterilized water. There were two technical duplicates for each of the four independent biological replications.

Expression Stability Analysis
The transcription level of each candidate gene was calculated from average Ct value. The expression stability was evaluated with the ΔCt methods (Nicholas et al. 2006), geNorm (Vandesompele et al. 2002), BestKeeper (Pfaffl et al. 2004), NormFinder (Andersen et al. 2004), and RefFinder (Faten et al. 2014) for comprehensive ranking of the tested candidate genes. All evaluations were conducted properly following the instructions of the software.

Reference Gene Validation
A CarE gene (GeneBank No: KX353552) was used to validate the selected reference genes using the 2 −ΔΔCt method. The transcription levels of this gene were estimated using the most stable (NF 1 ) and the least stable reference gene (NF 1-2 ) and the worst stable (NF 10 ) reference gene identified by RefFinder for the samples of different body parts and of different nutrient types. When normalizing using two reference genes, geometric mean was taken as the normalization factor (NF 1-2 ) which was calculated from the cycle threshold values of the two reference genes. The results were expressed as the mean ± SE.
The data were statistically analyzed using SPSS software (SPSS, Chicago, IL). One-way ANOVA followed by Turkey's multiple comparison tests were performed for the effect of reference genes. Statistical difference was claimed when P < 0.05.

Identification of Reference Gene Candidates
Multiple EST sequences for each selected reference gene candidate were obtained through key word search of the transcriptome dataset previously generated from A. hygrophila under the conditions of different nutrient types. The full-length coding sequence of each gene was further blasted against the NCBI database to confirm the gene identity. All 10 candidate genes were submitted to GenBank. The accession numbers and primer sequences used for qRT-PCR are listed in Table 1.

PCR Amplification Efficiency
Each primer pair of tested genes resulted in a single PCR product as displayed by a single band on the agarose gel or a single peak after melting curve analysis using RT-PCR or RT-qPCR, respectively (Suppl Fig. S1 [online only]). As shown in Table 1, the PCR efficiencies were between 92.14% (Tubulin) and 100.07% (RPL32) and the coefficients (R 2 ) were >0.99 for all 10 candidate genes as measured using LinRegPCR program (Table 1).

Expression Profiles of the Reference Genes Candidates
The relative abundance and variation of each gene were indicated by the mean and deviation of the Ct values from the 28 samples examined; the lower the Ct value the higher the abundance (Fig. 1).

Expression Stability of the Reference Gene Candidates
Four commonly used statistical programs of geNorm, Normfinder, BestKeeper, ΔCt, and a comprehensive statistical program RefFinder were used to evaluate the expression stability of the 10 candidate reference genes in different types of samples. For the samples of different body parts, all programs, except for BestKeeper, identified RPL32 as the most stable gene (Table 2). According to RefFinder, the overall order of these genes from the most stable to the least stable is: RPL32, RPL13a, TBP, SDHA, ELF, RPS13, GAPDH, RPS20, Actin, and Tubulin ( Fig. 2A). The geNorm analysis revealed that the pair-wise variation value V2/3 was 0.051, which is far less than 0.15, suggesting that two reference genes were enough for accurate normalization of gene expression in body part samples (Fig. 3).
For samples of different nutrient types (starvation, fed with host or non-host plant), Actin was identified as the most stable gene by geNorm, BestKeeper, and ΔCt ( Table 2). The overall ranking (from most stable to least stable) by RefFinder is as the following: Actin, RPL13a, RPS20, Tubulin, SDHA, GAPDH, TBP, RPL32, RPS13, and ELF (Fig. 2B). This ranking was very different from that of different body parts, suggesting the necessity of selecting different internal reference genes for different tissue types or experimental conditions. When all sample types were considered, TBP and RPL13a were the most stable genes identified by Normfinder, BestKeeper, and ΔCt ( Table 2). The overall stability ranking by RefFinder was as the following: TBP>RPL13a>Actin>RPL32>RPS20>RPS13>GAPDH>S DHA>Tubulin>ELF (Fig. 2C). The geNorm analysis revealed that the first V-value < 0.15 appeared at V2/3, suggesting that two reference genes were enough for accurate normalization of all conditions (Fig. 3).

Validation of Candidate Reference Genes
To validate the candidate reference genes, a CarE gene (GeneBank No: KX353552) was chosen. According to transcription profiling data collected using the next generation sequencing, this CarE gene was upregulated in A. hygrophila under starvation condition compared to feeding on the host plant A. philoxeroides (Y.-Q.G. et al., unpublished results). When Actin or RPL13a was used as the reference gene, significantly higher CarE transcription level was detected in the starvation insect group than those feeding on the host plants which was consistent with the result from sequencing. However, a different expression pattern was detected when the worst stable gene (ELF) was used as a reference that the transcription levels of CarE did not differ significantly between insects of starvation and feeding on host plants (Fig. 4A).
The CarE gene was expressed at lower levels (<2-folds) in heads and body parts than midguts regardless of the reference gene(s) used. However, for the midgut samples, the transcription level of CarE gene was comparable when two stable reference genes RPL32 or RPS13a were used, but was significantly lower when the worst stable gene (Tubulin) was used (Fig. 4B).

Discussion
Reference gene(s) is critical for the normalization of target gene expression using RT-qPCR. In this study, we examined 10 internal candidate reference genes from A. hygrophila and evaluated their expression stability with five statistical algorithms. Our results showed that none of the candidate reference genes could serve as a 'universal' normalizer. According to RefFinder, which assigns an appropriate weight to an individual gene and gives the overall final ranking, RPS32 was the most stably expressed gene in samples of different body parts. RPL13a appeared to be the best normalization factor for samples of different nutrient types. These results were in line with the reports in Acyrthosiphon pisum (Yang et al., 2014a), Drosophila suzukii (Zhai et al., 2014), and Locusta migratoria (Yang et al., 2014b). It emphasized that the stability of reference gene expression must be verified for different sample typed to ensure a constant level of expression (Thellin et al. 1999). RPL32 (ribosomal protein L32) encodes a ribosomal structural protein that is a component of the 60S subunit (Vorobieva et al. 2008). Previously RPL32 was shown to be an optimal reference gene to normalize RT-qPCR data in Kuruma shrimp (Sellars et al. 2007) and for the analysis of behavioral plasticity in Australian plague locust (Chapuis et al. 2011). In the current study, RPL32 was ranked at the top in stability for samples of different body parts but ranked lower (as the eighth of the 10 genes) for samples of different nutrient types.
Like RPL32, RPS13, RPL13a, and RPS20 are all ribosomal proteins (Kenmochi et al. 1998). In this study, RPL13a was ranked consistently high in all samples, while by contrast RPS13 and RPS20 demonstrated to be less stable among samples of different body parts and nutrient types even though they are also ribosomal proteins.
Our results in A. hygrophila correlated with those in Cryptolestes ferrugineus (Tang et al., 2017) and Helicoverpa armigera (Zhang et al., 2015) in which both RPS20 and RPS13 appeared to be poorly stable genes under various situations.
Actin has been widely used as the 'universal' reference gene (even without any validation) as it is considered to be a critical component of the protein scaffold in cytoskeleton maintenance, shape determination, and cell motility. However, our study found that Actin displayed very low stability compared to other references particularly in the samples of different body parts ( Fig. 2A). Several other studies have also revealed that the expression of Actin fluctuated with sample types (e.g., developmental stage, tissue types, etc.; Moshier et al. 1993, Deindl et al. 2002. Furthermore, our results showed that ELF and Tubulin were poor choices as RT-qPCR reference genes as they were not ranked in the top three reference genes in all cases (Table 2). GAPDH, RPS20, RPS13, and SDHA were ranked in the middle range in most experimental conditions in this study.
To determine the optimal number of reference genes required for samples of body parts or nutrient types, the pairwise variation (Vn/Vn+1) between two sequential normalization factors (NFn and NFn+1) was calculated by geNorm. A large variation means that an additional reference gene should be abandoned for the calculation of a reliable normalization factor. In this study, we proposed 0.15 as a cut-off value and the V2/3 values were all below 0.15. Therefore, we conclude that two best reference genes were enough for normalizing expression values of target genes under these tested conditions. The validation using the CarE gene provided further information that using two most appropriate genes as references resulted in a more accurate estimation of target gene transcription level than using a single gene. The current study suggests that under most experimental conditions, a single reference gene may not be enough for normalization of gene expression. Two or more reference genes are required to achieve accurate and reliable results (Vandesompele et al. 2002). Our results also demonstrated that the application of the least stable reference gene could result in false interpretation.

Conclusions
This current study provides a detailed assessment of different candidate reference genes for RT-qPCR studies of A. hygrophila with different sample types (body parts and nutrient types). RPS32 and RPL13a were found to be most reliable reference genes for samples of different body parts, while Actin and RPL13a were optimal  reference genes for samples of different nutrient types. This work further demonstrated the importance of reference gene selection and the benefit of combination of at least two reference genes for providing accurate quantification of gene transcription using RT-qPCR. The results of this research provide useful bases for future research in relation to gene transcription in A. hygrophila.