Internal Reference Gene Selection for Quantitative Real-Time RT-PCR Normalization in Potato Tissues

Quantitative real-time PCR (qRT-PCR) is widely used for investigating gene expression patterns and offers high sensitivity, fidelity, and specificity. Selecting a suitable internal reference gene is crucial for obtaining precise gene expression results in qRT-PCR analyses. In this study, the transcriptomic data of 2 potato varieties were screened for housekeeping genes with stable expression patterns. A total of 77 putative genes that were highly and stably expressed were selected. Then, qRT-PCR analyses were performed to examine the expression levels of these 77 candidate reference genes in various potato tissues, including leaves, flowers, stolons, and tubers. Gene expression was represented by the Ct values at a given threshold. Through geNorm and NormFinder analyses, the 10 candidate genes with the most stable expression patterns were obtained: RPL19, RPS15, RPS9, EF1α, TrxP1, RPS8, NTF, CAM, AACM, and RPS28. Moreover, comprehensive analyses with 4 statistical algorithms (i.e., geNorm, NormFinder, BestKeeper, and RefFinder) indicated that the most appropriate internal reference genes were RPL19 and EF1α. The obtained stable reference genes will contribute to future qRT-PCR analyses of tissue-related gene expression in potato.


Introduction
Potato is a herbaceous perennial tuber crop of the Solanaceae family and is now the fourth most important food crop in the world, following wheat, rice, and corn. Potato acreage and output have increased dramatically over the past 50 years. Additionally, potatoes provide more food energy and nutrition per unit than cereal crops while using the same amount of fertilizer or less. Currently, many genetic studies are being conducted to raise the productivity and quality of potatoes. Quantitative real-time PCR (qRT-PCR) is one of the most widely available analytical tools for gene expression analyses; however, its accuracy depends heavily on normalization against stably expressed reference genes [1].
Compared with traditional PCR, qRT-PCR is more powerful due to its high sensitivity, specificity, reproducibility, quantitative accuracy, and high-throughput characteristics [2]. It has become one of the most commonly used genomic research techniques in the field of plant sciences [1,3]. However, many factors affect the accuracy of gene expression analysis by qRT-PCR, such as the yield and integrity of RNA, the efficiency of reverse transcription, and the stability of the internal control gene [4]. Therefore, in order to minimize differences between samples and obtain data with high accuracy, selecting an appropriate internal reference gene for calibration and standardization is crucial [5]. However, no internal reference gene is currently known to be constant under all conditions. Usually, 1 or more internal reference genes are required to standardize qRT-PCR results [6]. A good combination of internal control genes can minimize non-biological variation [7]. Thus, the choice of a suitable internal control gene is of great importance for qRT-PCR experiments, as an ideal internal reference gene needs to be stably expressed in all tissues [8]. Thus far, reference genes have been identified under different treatments and conditions in several plants, such as rice [9], grapevine [10], sugarcane [11], wheat [12], Arabidopsis thaliana [13], tomato [14], and tea [15].
To our knowledge, reference gene combinations that are stably expressed in different potato tissues have not been previously reported. Thus, selecting 1 or more suitable reference genes will benefit future qRT-PCR studies on potatoes. In this study, RNA-Seq data were used to select candidate genes and examine the stability of putative reference genes. The study concentrated on candidate genes with relatively high expression levels in various potato tissues to evaluate their expression variability. The qRT-PCR data were analyzed using 3 prevalent programs, geNorm [28], NormFinder [8], and BestKeeper [29], to determine appropriate novel genes for gene expression studies under different circumstances. Then, RefFinder was used to order the genes ranked by geNorm, NormFinder, and BestKeeper. The findings of this study will contribute to the accurate and reliable normalization of gene expression studies on potatoes.
The concentration and quality of RNA samples were measured using a NanoDrop 2000C ultraviolet spectrophotometer (Thermo Fisher Scientific, MA, USA). First-strand cDNA was synthesized using an FSK-100 ReverTra Ace kit (Toyobo, Osaka, Japan) with 2 μg of RNA per reaction. Finally, the cDNA solutions were diluted (1:10) with ddH2O and stored at −20°C for subsequent qRT-PCR reactions.

Potential Reference Gene Selection
To ensure the correctness and reliability of potential reference gene selection, the fragments per kilobase per million reads (FPKM) values of 2 potato strains, the doubled monoploid DM1-3 (DM) and the heterozygous diploid RH89-039-16 (RH), were obtained from the Potato Genome Sequencing Consortium (PGSC) database. First, genes with an FPKM value > 50 in all tissues of both potato strains were selected. In total, 39,000 genes were identified. Then, the average expression (AVE) and standard deviation (SD) of the selected genes across the various tissues were calculated. The coefficient of variation (CV) was computed as SD/AVE. Genes with CV values < 0.5 were retained for the DM and RH strains, giving 151 genes for DM and 147 genes for RH. Afterward, Venn diagrams were constructed to determine the number of overlapping genes between the DM and RH strains; 77 genes were shared between the 2 strains. Subsequently, qRT-PCR was conducted to test the expression levels of the candidate reference genes. The selected genes were further screened using geNorm and NormFinder to identify the top 10 internal reference genes (Fig. 1).
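The screening steps above (FPKM threshold, CV = SD/AVE, CV cutoff, overlap between strains) can be illustrated with a minimal Python sketch. The gene names and FPKM values are invented toy data, the function name `stable_genes` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which SD estimator was used.

```python
from statistics import mean, stdev

def stable_genes(fpkm, min_fpkm=50.0, max_cv=0.5):
    """Return {gene: CV} for genes whose FPKM exceeds min_fpkm in every
    tissue and whose CV (SD / AVE) is below max_cv.

    fpkm maps a gene name to a list of FPKM values, one per tissue.
    Assumption: SD is the sample standard deviation.
    """
    kept = {}
    for gene, values in fpkm.items():
        if min(values) <= min_fpkm:      # must be > min_fpkm in all tissues
            continue
        cv = stdev(values) / mean(values)
        if cv < max_cv:
            kept[gene] = cv
    return kept

# Toy FPKM values for two strains (hypothetical, four tissues each).
dm = {
    "geneA": [120.0, 100.0, 110.0, 130.0],  # high and stable
    "geneB": [60.0, 400.0, 80.0, 55.0],     # high but variable
    "geneC": [10.0, 90.0, 70.0, 60.0],      # below threshold in one tissue
}
rh = {
    "geneA": [90.0, 105.0, 95.0, 100.0],
    "geneB": [70.0, 75.0, 80.0, 72.0],
    "geneC": [200.0, 210.0, 190.0, 205.0],
}

# Equivalent of the Venn-diagram overlap: genes retained in both strains.
shared = sorted(set(stable_genes(dm)) & set(stable_genes(rh)))
print(shared)
```

In this toy example only `geneA` passes both filters in both strains, mirroring how the overlap between the two strain-specific lists defines the candidate set.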

Primer Design and qRT-PCR
Forward and reverse primers were retrieved from the qPCR Primer Database [30,31]. A total of 66 pairs of suitable primers corresponding to 66 genes were found. For the remaining 11 genes, the Primer v6.0 program was used to design primers. The primers for the 77 selected genes were synthesized by GenScript Corporation (Nanjing, China) (Tab. S1).
qRT-PCR experiments were conducted to estimate the expression levels of the candidate genes on 96-well plates using SYBR Green-based PCR assays, which were performed with the KK4601 SYBR green mix (Kapa Biosystems, MA, USA) in a CFX96 RT-PCR machine (BioRad Laboratories, CA, USA). Experiments were conducted with at least 3 biological replicates, and each biological replicate was run with 3 technical replicates. Each PCR mixture (20 μL) contained 1 μL cDNA, 10 μL SYBR green mix, 1 μL forward primer, 1 μL reverse primer, and 7 μL ddH2O. The PCR procedure was as follows: 95°C for 2 min, followed by 39 cycles of 95°C for 5 s and 60°C for 30 s, and a final melting curve from 65°C to 95°C (Δ0.5°C/s) [18].

Statistical Analyses
The Ct values were analyzed to obtain the stability levels and ranks of the putative reference genes in each potato tissue using the geNorm [32], NormFinder [8], and BestKeeper [29] algorithms. The web-based tool RefFinder, which integrates these widely used computational programs, was utilized to produce a recommended comprehensive ranking, as it is a convenient method for evaluating the appropriate reference genes for qRT-PCR analysis [27].
The geNorm program was used to evaluate the reference genes, as it can screen any number of internal reference genes and select 2 or more of them to form a combination for correcting the data, thereby making the relative quantification results more accurate. The $2^{-\Delta Ct}$ method (ΔCt = the Ct value of each gene minus the lowest Ct value of all genes) was used to convert the original Ct values of different genes in different samples [27]. Then, the converted values were imported into the geNorm program to obtain the gene expression stability value (M). The smaller the M value, the more stable the reference gene; thus, the gene with the largest M value was the most unstable gene. The program also calculated the pairwise variation values (V) of the normalization factors to determine the optimal number of internal reference genes [32].
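As an illustration of the conversion and the stability measure described above, the following Python sketch computes relative quantities from Ct values and a geNorm-style M value, taken here as the average standard deviation of the log2 expression ratios between one gene and every other candidate. This is a simplified sketch and not the geNorm program itself; the Ct values and gene names are invented, and the lowest Ct is taken per gene, which is an assumption (any constant offset cancels in the SD of the log ratios, so this choice does not change M).

```python
from math import log2
from statistics import mean, stdev

def relative_quantities(ct):
    """Convert Ct values to relative quantities 2^-(Ct - lowest Ct).

    ct maps a gene name to a list of Ct values, one per sample.
    Assumption: the lowest Ct is taken per gene.
    """
    return {g: [2.0 ** -(c - min(v)) for c in v] for g, v in ct.items()}

def genorm_m(q):
    """geNorm-style M: for each gene, the mean over all other genes of the
    SD (across samples) of the log2 ratio of their relative quantities."""
    m = {}
    for j in q:
        sds = []
        for k in q:
            if k == j:
                continue
            log_ratios = [log2(a / b) for a, b in zip(q[j], q[k])]
            sds.append(stdev(log_ratios))
        m[j] = mean(sds)
    return m

# Toy Ct values for three hypothetical genes in four samples.
ct = {
    "g1": [20.0, 21.0, 22.0, 21.0],
    "g2": [22.0, 23.0, 24.0, 23.0],  # moves in parallel with g1
    "g3": [25.0, 23.0, 26.0, 22.0],  # varies independently
}
m_values = genorm_m(relative_quantities(ct))
for gene, m in sorted(m_values.items(), key=lambda kv: kv[1]):
    print(gene, round(m, 3))
```

Genes whose expression ratio to the other candidates stays constant across samples receive a low M, so `g1` and `g2` come out more stable than `g3` in this toy data.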
The NormFinder program calculated the M value of each candidate reference gene and the V values among sample groups. However, the program selects only the single most appropriate internal reference gene. The input data were prepared in the same way as for geNorm [8].
The BestKeeper program is relatively simple: the geometric mean (CP) of the fluorescence values obtained from 3 repeated qRT-PCR analyses is calculated and entered into the table. Once the last CP value was entered, the BestKeeper program automatically calculated the correlation coefficient (R), SD, and CV. Based on these 3 values, the stability of the reference genes was estimated [29].
The RefFinder program, which integrates the other 3 programs, is a web-based tool that analyzes the stability of the reference genes by combining the results of these 3 programs to obtain an integrated ranking [33].
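The three BestKeeper-style descriptors named above (R, SD, CV) can be sketched in Python as follows. This is only an approximation under stated assumptions: SD is the ordinary sample standard deviation of the Ct values (the BestKeeper spreadsheet may define its SD differently), CV is expressed as a percentage of the mean Ct, and R is the Pearson correlation between each gene and an index formed from the per-sample geometric mean of all candidates. The Ct values, gene names, and the function name `bestkeeper_like` are hypothetical.

```python
from math import sqrt
from statistics import geometric_mean, mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

def bestkeeper_like(ct):
    """Per-gene SD, CV (%) and correlation R with a combined index.

    ct maps a gene name to a list of Ct values, one per sample.
    Assumption: the index is the per-sample geometric mean of all genes.
    """
    genes = list(ct)
    n_samples = len(ct[genes[0]])
    index = [geometric_mean([ct[g][s] for g in genes])
             for s in range(n_samples)]
    stats = {}
    for gene, values in ct.items():
        sd = stdev(values)
        stats[gene] = {
            "sd": sd,
            "cv_percent": 100.0 * sd / mean(values),
            "r": pearson_r(values, index),
        }
    return stats

# Toy Ct values for three hypothetical genes in four samples.
ct = {
    "g1": [20.0, 21.0, 22.0, 21.0],
    "g2": [22.0, 23.0, 24.0, 23.0],
    "g3": [25.0, 23.0, 26.0, 22.0],
}
stats = bestkeeper_like(ct)
for gene, s in stats.items():
    flag = "unstable (SD > 1)" if s["sd"] > 1 else "acceptable"
    print(gene, round(s["sd"], 2), round(s["cv_percent"], 2),
          round(s["r"], 2), flag)
```

In this toy data, `g3` exceeds the SD > 1 criterion and would be flagged as unstable, while `g1` and `g2` remain acceptable.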

Expression Profiles of the Reference Genes
The average Ct values of the 10 putative reference genes from different samples are summarized in Fig. 2. These results indicated that the average Ct values of all candidate genes varied from 20 to 27. The Ct values of EF1α and RPS28 were the lowest, ranging from 20.04 to 22.41 and 20.31 to 24.42, respectively, indicating that the expression abundances of these 2 genes were the highest. NTF had the highest Ct value, ranging from 22.67 to 25.28, indicating that it had the lowest expression abundance. Thus, it was concluded that the changes and differences in the mean Ct values were related to the M values of different genes. Interestingly, the same gene had different M values in different tissues. Thus, it is clearly vital to select stable candidate reference genes for data normalization in different tissues.

geNorm Analysis
The 10 putative reference genes were evaluated to obtain M values using the geNorm program, which supports normalization with multiple internal reference genes [28]. The average M values were computed from the Ct-converted data. According to the selection principles of the geNorm program, a candidate gene is regarded as a stable reference gene if its M value is < 1.5, and genes with lower M values are more stably expressed. Results revealed that the M values of the 10 internal reference genes were all < 0.65, much less than the standard 1.5 threshold, indicating that they were relatively stable. Among these genes, RPS28 was the least stable, whereas RPS15 and RPS9 formed the best gene combination, with both genes stably expressed. The M values of the remaining 7 candidate genes fell between these extremes (Fig. 3).
Accurate normalization may require more than 1 internal reference gene; the number of genes to combine is determined by the V values [32]. According to the underlying principle, it is not necessary to add an extra gene for reliable normalization unless the value of $V_{n/n+1}$ is > 0.15. In this study, the lowest pairwise variation, $V_{2/3}$, was 0.093, which was < 0.15; thus, 2 internal reference genes are sufficient for accurate normalization, and no additional genes need to be included (Fig. 4).
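The pairwise variation criterion can be sketched in Python as follows: the normalization factor NF_n is taken as the per-sample geometric mean of the relative quantities of the n most stable genes, and $V_{n/n+1}$ as the standard deviation of log2(NF_n / NF_{n+1}) across samples. This is a simplified illustration and not the geNorm program; the relative quantities, gene names, and stability order are invented, and the 0.15 cutoff is the one quoted in the text.

```python
from math import log2
from statistics import geometric_mean, stdev

def normalization_factor(q, genes):
    """Per-sample geometric mean of the relative quantities of `genes`."""
    n_samples = len(q[genes[0]])
    return [geometric_mean([q[g][s] for g in genes])
            for s in range(n_samples)]

def pairwise_variation(q, ranked, n):
    """V_{n/n+1}: SD across samples of log2(NF_n / NF_{n+1}).

    ranked lists genes from most to least stable (assumed given,
    for example from a prior M-value ranking).
    """
    nf_n = normalization_factor(q, ranked[:n])
    nf_next = normalization_factor(q, ranked[:n + 1])
    return stdev([log2(a / b) for a, b in zip(nf_n, nf_next)])

# Toy relative quantities for three hypothetical genes in four samples.
q = {
    "g1": [1.0, 0.5, 0.25, 0.5],
    "g2": [1.0, 0.5, 0.25, 0.5],
    "g3": [0.125, 0.5, 0.0625, 1.0],
}
ranked = ["g1", "g2", "g3"]

v23 = pairwise_variation(q, ranked, 2)
verdict = "add a third gene" if v23 > 0.15 else "two genes suffice"
print(round(v23, 3), verdict)
```

In this toy data the third gene behaves very differently from the first two, so $V_{2/3}$ is well above 0.15; in the study, the reported value of 0.093 is below the cutoff, which is why 2 genes were considered sufficient.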

NormFinder Analysis
The M values of all 10 candidate reference genes were analyzed using the NormFinder program, which can compare the expression differences of putative reference genes. Additionally, it can calculate the variation between sample groups, but it can only select the most appropriate candidate reference gene [34]. Based on the results, the reference gene with the lowest M value was regarded as the most stable reference gene. RPL19 had the lowest M value and was ranked first for expression stability. CAM ranked second, and EF1α ranked third. TrxP1 was the most unstable gene with the largest M value (Fig. 5). It is worth noting that RPL19 was ranked second by the geNorm program, indicating that there were some differences between the 2 programs.

BestKeeper Analysis
BestKeeper is an Excel-based program used for analyzing the expression variation of candidate reference genes. It is used to compare candidate internal reference genes and to assess the expression levels of genes of interest. However, it can only handle the expression levels of up to 10 reference genes and 10 target genes in up to 100 samples. The R, SD, and CV values of each gene were then obtained. The larger the R value and the smaller the SD and CV, the better the stability of the internal reference gene; an SD > 1, however, indicates that the reference gene is unstable. The SD values of the 10 selected genes were all < 1 (Tab. 2); thus, they were considered acceptable internal reference genes. In the resulting ranking, the least stable gene was RPS28, while the most stable gene was TrxP1.

RefFinder Analysis
RefFinder is an online analytical tool that integrates the existing Delta Ct, geNorm, NormFinder, and BestKeeper analytical procedures. By computing the geometric mean of the ranking weights that each method assigns to a gene, a comprehensive ranking of the internal reference genes was obtained. The results of the RefFinder program are provided in Fig. 6. The ranking from most to least stable was as follows: RPL19 > RPS15 > RPS9 > EF1α > TrxP1 > RPS8 > NTF > CAM > AACM > RPS28. Therefore, RPL19 was the most suitable gene, matching the results of NormFinder. In contrast, RPS28 was the least stable gene.
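The integration step can be illustrated with a short Python sketch that combines several per-method rankings by taking the geometric mean of each gene's rank positions. This is a simplified sketch of the idea, not the RefFinder tool itself; the gene names and per-method rankings below are invented placeholders rather than results from this study, and using the plain rank position (1 = most stable) as the weight is an assumption.

```python
from statistics import geometric_mean

def comprehensive_ranking(rankings):
    """Combine per-method rankings into one ordering.

    rankings is a list of lists, each ordering the same genes from most
    to least stable. Each gene's score is the geometric mean of its
    1-based rank positions; a lower score means a more stable gene.
    """
    genes = rankings[0]
    score = {
        g: geometric_mean([r.index(g) + 1 for r in rankings])
        for g in genes
    }
    order = sorted(genes, key=lambda g: score[g])
    return order, score

# Hypothetical rankings from three methods for three placeholder genes.
method_rankings = [
    ["geneA", "geneB", "geneC"],
    ["geneB", "geneA", "geneC"],
    ["geneA", "geneC", "geneB"],
]
order, score = comprehensive_ranking(method_rankings)
for gene in order:
    print(gene, round(score[gene], 2))
```

Because the geometric mean is used, a gene that is ranked first by most methods keeps a low combined score even if one method places it lower, which is how a single consensus order can emerge from rankings that disagree.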

Discussion
In qRT-PCR analyses, the stability of the same reference gene in the same species may differ considerably under different experimental conditions. The stability of reference genes is an important precondition for gene expression research [2,8,35]. Notably, various tissues in the same species may require different internal reference genes. For instance, ACT was the most stable reference gene across different tissues of Eucalyptus tereticornis. However, in leaf and internode tissues, cellulose synthase 4 (CesA4), cellulose synthase 5 (CesA5), and cellulose synthase 6 (CesA6) were the most stable reference genes, but they were unstable in growing and mature xylem tissues [36].
Unlike conventional verification experiments for internal reference genes, this study did not directly select commonly used housekeeping genes as experimental genes. Instead, the FPKM algorithm was used to calculate gene expression stability in the transcriptome data of various potato tissues [37]. Based on the transcriptome data, genes were relatively, moderately, or highly expressed in the DM and RH strains. According to the CVs, the genes of DM and RH were screened separately.
The screened genes were considered putative reference genes for qRT-PCR experimental validation, and the stability of their expression was analyzed using 4 common internal reference gene analysis programs: NormFinder, geNorm, BestKeeper, and RefFinder. NormFinder, geNorm, and BestKeeper analyze putative reference genes with different algorithms to select a suitable internal reference gene for expression studies under different experimental conditions. NormFinder and geNorm process data by using the $2^{-\Delta Ct}$ method to convert the original Ct values of different genes from different samples. Moreover, geNorm also selects the optimal number of genes based on V values. According to previous studies, the number of internal reference genes should be based on the $V_{n/n+1}$ value. If only 1 internal reference gene is selected, it may cause errors in the experimental results [6]. Although NormFinder can compare the expression differences of candidate internal reference genes and calculate the variation between sample groups, this program can only select the single most suitable internal reference gene [34]. BestKeeper performs its calculations directly on the Ct values obtained from the qRT-PCR analyses, which differs from geNorm and NormFinder [29]. Although previous studies have shown that different analytical methods can produce the same results [38], the results are not always the same [27,39,40], which may be because each method assesses the stability of gene expression on a different basis. Therefore, given the differences in the results obtained by the different analytical algorithms in this study, the web-based tool RefFinder was used to comprehensively evaluate and screen ideal internal reference genes to ensure accuracy and dependability.
The 4 programs used in this study have been widely used for selecting internal reference genes in plants, animals and microorganisms [40]. In this study, it was not always appropriate to select EF1α as the reference gene in the qRT-PCR. Based on the results, although EF1α ranked in the top 5 genes based on the 4 programs, it was not always ranked first, indicating that it was not the most appropriate internal reference gene. geNorm revealed that 2 genes, RPS15 and RPS9, were stably expressed in potato leaves, flowers, stolons, and tubers. Based on the V values, the appropriate number of reference genes was 2. The NormFinder results revealed that the best gene combination was RPL19 and CAM. BestKeeper determined that the 2 most stably expressed genes in the 4 tissues were TrxP1 and EF1α. Based on the RefFinder analysis, RPL19 and RPS15 were the most stable reference genes in the potato tissues. It should be noted that the integrated results obtained by these different analysis methods were different from one another. However, based on a comprehensive analysis of the 4 programs, RPL19 and EF1α were better than the other genes under different analyses and ranked in the top 4 genes in all 4 programs (Tab. 3). Previous studies have suggested that when performing gene expression analyses, compared to using a single reference gene to normalize the expression value of target genes, selecting 2 or more internal reference genes is conducive to obtaining accurate and reliable results [19]. Therefore, based on the findings of this study, RPL19 and EF1α are recommended as internal reference genes for the future qRT-PCR analyses on potato tissues.

Conclusions
In this study, the stability of reference genes in various potato tissues was analyzed, and Ribosomal protein L19 (RPL19) and Elongation factor 1-alpha (EF1α) were selected to normalize qRT-PCR gene expression results in potato tissues. This study provides suitable internal reference genes for future potato research and also lays an indirect foundation for the study of potato genetics.