Introduction

The advent of next-generation sequencing technologies has resulted in a significant increase in transcriptomic data for various organisms1. The generated transcriptomes have helped researchers not only to decipher expression pattern of genes and transcripts but also define the genetic architecture of many species2,3. Validation of gene expression from such transcriptomic resources has become mandatory for reporting and reconfirming expression profiles4 and reverse transcription quantitative PCR (RT-qPCR) is rapidly replacing traditional methods such as Northern blotting and Ribonuclease Protection Assays (RPA)5,6. RT-qPCR's speed, sensitivity, efficiency and reproducibility has made it the gold standard for rapid and accurate quantification of gene expression profiles7 from various next-generation sequencing (NGS) datasets. Commercially available instruments and consumables have further led to universal acceptance of RT-qPCR6.

Gene expression using RT-qPCR assay is based on the principle of quantifying target mRNA during the exponential phase of the PCR. During this phase, the target is doubled with each PCR cycle and the probability of PCR-bias due to limited reagents is nullified or decreased. In RT-qPCR, the amplification of product is detected on accumulation of fluorescent signal. The cycle at which the fluorescent signal exceeds background signals is referred to as the threshold cycle, or Ct (also referred to as the quantification cycle, or Cq). Analysis of Cq is used to estimate expression of the respective genes. Several factors, such as RNA quality and quantity, mastermix components used in the PCR and efficiency of PCR reaction, influence Cq values8. Absolute and relative quantification methods are generally used to estimate gene expression using RT-qPCR. The absolute quantification approach necessitates generating the standard curve of known copy numbers of each target, which in turn requires standard curves for multiple targets in a study and knowledge of copy numbers of each target, thus limiting its usage. The most widely adopted approach, relative quantification, is based on estimation of gene expression normalized to the expression of a control gene known as a reference; therefore, reliable determination of reference/references is the central factor in accuracy of this method.

A gene can be used as a reference when it is highly and stably expressed in all samples under investigation and is not co-regulated with a target gene9,10. Reference genes traditionally have been housekeeping genes believed to possess inherent stable and constitutive expression irrespective of physiological conditions in different samples or treatments under investigation11,12. The universal stability of housekeeping gene expression was disproven in recent years13,14, negating their use15. According to the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines16, reference gene(s) are now selected based on their specificity in interactions between a species or cell type/s subjected to different treatments or conditions.

The Russian wheat aphid (RWA), Diuraphisnoxia Kurdjumov, originally from Central Asia, was introduced globally in the 1900s17 and has become a major destructive pest of wheat and barley and caused huge economic losses. Host-plant resistance is the most acceptable and ecologically beneficial aphid pest management strategy in wheat growing regions, yet host-plant resistance is transitory in many cases, because virulent populations develop to survive in once cultivated resistant varieties18. The short-term solution to this problem traditionally has been to identify new sources of resistance and breed new RWA-resistant wheat varieties, but new technologies such as NGS and RT-qPCR offer the possibility of understanding the molecular principles supporting RWA virulence and engineering resistant plants with greater longevity. Our research is focused on investigating the molecular mechanisms of RWA-plant interaction with the long-term goal of finding novel and durable solutions to RWA management.

The objective of this investigation was to identify a robust RWA reference gene for use in validation of gene expression studies in RWA-wheat interactions. We identified four previously unreported RWA sequences commonly used as reference controls in other biological systems and report the most suitable reference controls for RT-qPCR assays of RWA genes expressed in aphid-wheat interactions.

Results and Discussion

Analysis of RNA quality, sequencing and sequence analysis

Bioanalyzer analysis of RNA revealed a ribosomal shift due to denaturation of 28 s rRNA (see Supplementary Fig. 1 online), consistent with previously published examples of insect RNA19. Multiple peaks were absent, indicating no RNA degradation. Thus, screening total RNA using NanoDrop spectrophotometry, gel electrophoresis and capillary electrophoresis produced high-quality RNA for RNAseq library preparation. The libraries analyzed using the Bioanalyzer revealed fragment sizes of >230 bp. The quality of the 18 libraries (6 treatments × 3 replications) is shown in Supplementary Fig. 2 online. Libraries were sequenced and filtered to remove low-quality sequences, providing >10 million reads for each library. Results of BLAST analysis and differential gene/transcript analysis will be reported separately.

Gene selection and RT-qPCR

The commonly used reference genes actin, ribosomal protein L9, ribosomal protein L27, transcription elongation factor 2 and ribosomal protein L5 were selected based on a literature survey. Actin has been used as a reference gene in many studies of model insects such as the fruit fly, Drosophila melanogaster Meigen20. Transcription elongation factor genes were evaluated to check their suitability as reference genes in several insects, particularly Hemipterans such as the sweetpotato whitefly, Bemisia tabaci (Gennadius)21 and brown plant hopper, Nilaparvata lugens (Stål)22. Genes coding for ribosomal proteins have been reported as most reliable reference in insects such as N. lugens22. Many transcripts coded for these genes were identified from RNAseq data, revealing >93% homology to the respective genes in the pea aphid, Acyrthosiphonpisum (Harris) (Table 1). Representations of these genes observed in RNAseq data confirmed that there was little or no change in their expression. These genes were amplified from cDNAs, yielding amplicon sizes ranging from 220 bp to 553 bp (Table 1). These sequences have been deposited in DNA Databank of Japan with accession numbers AB914563-AB914566.

Table 1 Primers designed for amplification and sequencing of selected D. noxia genes

Amplicon size and melting temperature of RT-qPCR primers designed from these sequences are shown Table 2. Primer efficiency ranged from 91.50 to 102.65, with corresponding R2 values from 0.9922 to 0.9995) (Table 2). Primers were redesigned and efficiencies rechecked in cases where the efficiency or R2 values differed from the range of recommended values. Melt curve analysis of all genes showed no primer dimers or nonspecific product amplification.

Table 2 Primers designed for amplification and RT-qPCR of selected genes from D.noxia feeding on different wheat genotypes

Analysis using geNorm software

Cq values derived from RT-qPCR assay for expression stability were logarithmically transformed as input for geNorm analysis. The principle behind this algorithm is that if two genes are stably expressed in a sample set then the ratio of their logarithmic transformed expression should be constant; thus, geNorm ranks genes based on their average expression stability (M) and the candidate gene possessing the lowest M value is the most stably expressed gene in that set. Figure 1 depicts the average expression stability of actin (M = 0.210) and ribosomal protein L27 (M = 0.267), the two most stable genes that should be used as references in RT-qPCR assays involving D. noxia-wheat interactions.

Figure 1
figure 1

The average expression stability values (M) derived using geNorm software for the candidate genes actin, ribosomal protein L5 (RPL5), ribosomal protein L9 (RPL9), translation elongation factor 2 (TEF2) and ribosomal protein L27 (RPL27).

The least stable candidate gene is plotted on the left and the most stable gene is on the right.

geNorm also predicts an optimal number of reference genes for accurate representation of gene expression based on calculation of pairwise variation (Vn/Vn + 1) between sequential normalization factors (NFn and NFn + 1). A cutoff value of 0.15 is considered for the ratio, below which there is no requirement of any other reference gene13. Large pairwise variation represents a significant effect in gene expression due to the addition of another gene or genes, reinforcing the need for the second gene to be included to derive reliable normalization factors. Candidate genes are added based on the ranking derived using the M value. This analysis, depicted in Figure 2 for our experiments, clearly showed that adding a third gene does not increase the ratio by more than 0.15; therefore, actin and ribosomal protein L27 represent the required reference genes for accurate estimation of D. noxia gene expression in aphid-wheat interactions.

Figure 2
figure 2

geNorm pairwise variation (V) analysis to determine optimal number of reference genes for normalization in RT-qPCR reaction.

Analysis by NormFinder software

NormFinder analyzes candidate reference genes according to inter- and intra-group variation in expression. Similar results were obtained using NormFinder, which predicted stability value of 0.735 for ribosomal protein L27 followed by stability value of 0.738 for actin (Figure 3). Therefore, both geNorm and NormFinder outputs provide proof for stable ribosomal protein L27 gene that can be used as a reference in all the RT-qPCR assays that studies expression patterns of genes in the Russian wheat aphid feeding on different wheat varieties containing aphid resistance genes.

Figure 3
figure 3

NormFinder stability index for actin, ribosomal protein L5 (RPL5), ribosomal protein L9 (RPL9), translation elongation factor 2 (TEF2) and ribosomal protein L27 (RPL27).

Genes with the least stable expression are on the left and the most stable genes are on the right.

BestKeeper analysis software

BestKeeper considers the Cq values of all candidate reference genes, in order to calculate standard deviation (SD) and coefficient of variation (CV). The software excludes genes with Cq value SDs greater than 1. Results of geNorm and NormFinder analyses were further verified when Cq data were analysed using BestKeeper software, which predicts gene stability based on low CV and SD. Ribosomal protein L27 (CV = 2.48 ± 0.47%) and actin (2.85 ± 0.51%) were found to be the most stable (Table 3).

Table 3 Descriptive statistics based on crossing point (CP) data and expression stability of five D. noxia reference genes calculated by BestKeeper Software

Validation of reference genes

Results of the current study indicate actin and ribosomal L27 to be the most appropriate reference genes in RT-qPCR assays involving RWA-wheat interactions. To validate their use, we assessed expression of tRNA-Leu, which we previously reported to be up-regulated in the gut transcriptome of RWA2 fed Dn4 plants23. Results of this experiment revealed significant (p < 0.001) over-expression of tRNA-Leu in RWA2/Dn4 interactions in comparison to other interactions (Figure 4) and very stable CV- and M values (Mean CV = 0.168, Mean M = 0.48) for the actin - ribosomal L27 combination. In contrast, use of the actin - RPL5 combination (Mean CV = 0.97, Mean M = 4.2) or the RPL27 - RPL5 combination (Mean CV = 0.952, Mean M = 3.95) - resulted in huge changes in expression and Mean CV- and M values above acceptable stability values [homogenous (CV < 0.25; M < 0.5) and heterogeneous (CV < 0.5; M < 1)]. Thus, the results of the tRNA-Leu expression validation experiment confirm the use of actin and RPL27 for RWA expression analysis.

Figure 4
figure 4

Relative normalized expression of the t-RNA-Leu gene using reference gene combinations of: (a) actin and ribosomal protein L27 (RPL27), (b) actin and ribosomal protein L5 (RPL5) and (c) RPL27 and RPL5.

R1D0 = RWA1 fed wheat plants carrying Dn0 (no resistance) gene; R1D4 = RWA1 fed wheat plants carrying the Dn4 resistance gene; R1D7 = RWA1 fed wheat plants carrying the Dn7 resistance gene; R2D0 = RWA2 fed wheat plants carrying Dn0 (no resistance) gene; R2D4 = RWA2 fed wheat plants carrying the Dn4 resistance gene; R2D7 = RWA2 fed wheat plants carrying the Dn7 resistance gene.

Conclusions

This study emphasizes the use of an appropriate reference gene(s) for RT-qPCR studies, reports sequences of five selected genes from the Russian wheat aphid and identifies reference genes with stable expression in RWA feeding on wheat plants carrying different RWA resistance genes. Analyses of our data using three different statistical techniques indicated that RWA ribosomal protein L27 (RPL27) is the most stable reference gene, closely followed by actin. Ideally, both RPL27 and actin genes should be used as references for normalizing expression profiles in RT-qPCR studies. This study is an important step in our ongoing RNAseq studies to define differential expression of RWA genes in RWA-wheat interactions. Results of these experiments will be of utmost importance to RWA genome sequencing efforts as well as the huge amount of expression data derived from current studies using NGS technologies.

Methods

Insect and plant material

RWA1 (biotype 1) collected from wheat fields near Hays, KS and RWA2 (biotype 2) individuals collected from wheat fields near Briggsdale, CO (via the USDA-ARS Plant Science Research Laboratory at Stillwater, OK), were cultured in separate housing in the greenhouse at Kansas State University on plants of susceptible wheat cultivar ‘Jagger’ for use in all experiments. The identity of each biotype was verified independently in diagnostic plant differential greenhouse assays at Stillwater and Manhattan. Voucher specimen no. 176 (RWA2) is deposited with the Kansas State University Museum of Entomological and Prairie Arthropod Research.

RWA1 and RWA2 adults were starved for 12 h, then fed on wheat plants containing RWA-resistant genes Dn4 (Yumar), Dn7 (94M370) and the susceptible control Yuma (Dn0) in a fine screen mesh cage in the greenhouse. All plants were grown in the greenhouse in 16.5-cm-diameter-plastic pots containing Pro-Mix-Bx potting mix (Premier ProMix, Lansing, MI). Environmental conditions were 24:20°C day/night and a photoperiod of 14:10 [L:D] h). Aphids were collected from plants at 24, 48, 72 and 96 h post-feeding, stored immediately in RNAlater solution (Qiagen, GmbH, Hilden, Germany), incubated overnight at 4°C and frozen to stabilize total RNA according to manufacturer's recommendations.

D. noxia RNA isolation and quality check

Total RNA was isolated from each sample (24, 48, 72 and 96 h post infestation) separately. Silica matrix-based RNA extraction was performed using the RNeasy Plus Kit (Qiagen). Tissues were homogenized using a micropestle according to the manufacturer's recommendations for isolation with some modifications. The gDNA eliminator column was not used for DNase digestion. Instead, an extra on-column-based digestion was performed using an RNase-free DNase set (Qiagen). Purity and integrity of isolated total RNA was assessed by three different methods, including absorbance at 260 and 280 nm in a NanoDrop spectrophotometer (Thermoscientific, Wilmington, DE), 1% agarose (RNase-free grade) gel electrophoresis using GelGreen staining (Biotium Inc., Hayward, CA) and capillary electrophoresis using an RNA Nano Lab-Chip (Agilent, Santa Clara, CA) and an Agilent 2100 Bioanalyzer system24.

Total RNAs qualifying according to recommended NanoDrop spectrophotometry quality standards were subjected to agarose gel electrophoresis and samples with intact bands were processed in the Agilent 2100 Bioanalyzer at the Kansas State University Integrated Genomics Facility in Manhattan, KS. The Bioanalyzer provided details about RNA integrity summarized as a RNA integrity number (RIN) (from 1--10; maximum-no degradation) and rRNA (28 s/18 s) ratio (~2 means very high quality RNA)25. Total RNA collected at 24, 48, 72 and 96 h post-feeding by each biotype was pooled to form the treatments RWA1 fed Dn0 (RWA1/Dn0), RWA1 fed Dn4 (RWA1/Dn4), RWA1 fed Dn7 (RWA1/Dn7), RWA2 fed Dn0 (RWA2/Dn0), RWA2 fed Dn4 (RWA2/Dn4) and RWA2 fed Dn7 (RWA2/Dn7).

TrueSeq library construction, sequencing and bioinformatics analysis

mRNA was purified, using oligo-dT beads, from 1 μg of total RNA of each treatment-replication sample provided in TruSeq RNA Sample Preparation Kit v2 (Illumina Inc., San Diego, CA). After purification, first and second strand cDNA was synthesized, 3′ ends were repaired and adenylated and DNA fragments were enriched after adapter ligation and indexing. The generated library was validated and pooled based on the specific index or bar-code for sequencing. Samples were loaded in two lanes and sequenced in a Hiseq 2500 Illumina sequencer at the University of Kansas Medical Center in Kansas City, KS.

Raw sequences generated from sequencing were de-multiplexed to define their sample of origin. Fastq output files of each sample were assessed for quality control (QC) using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and de novo assembly of sequences was conducted using Trinity software (http://www.trinity-software.com). Transcript and gene abundance were estimated and expression was evaluated using DESeq (differential gene expression analysis) and EdgeR (empirical analysis of digital gene expression), both from Bioconductor Open Source Software for Bioinformatics (http://bioconductor.org/).

cDNA synthesis and selection of candidate reference genes

cDNA used for PCR standardization and gene amplification was synthesized using Superscript III First-strand Synthesis Supermix (Life Technologies, NY) and oligodT primers according to manufacturer's protocol. For RT-qPCR, an equal amount of total RNA (500 ng) from each sample was used for cDNA synthesis, using the iScript Reverse Transcription Supermix for RT-qPCR (Bio-rad, Hercules, CA).

Candidate reference genes were selected based on literature survey and identification in the library. Primers were designed using Primer 3 (http://primer3.ut.ee/) (Table 1). The PCR profile used was 94°C for 30 s, 30 cycles of 94°C for 30 s followed by 55–58°C and 72°C, each for 30 s. The amplicons were sequenced in the Kansas State University Integrated Genomics Facility.

RT-qPCR primer design and test of amplification efficiency

Primers of sequenced transcripts considered for RT-qPCR were designed using Primer Express Software V.3.0.1 (Life Technologies) with slight modifications of the default parameters (Table 2) to avoid hairpin, self-dimer and cross-dimer secondary structures. Primers were selected based on low penalty scores, which indicated a good match to the set parameter settings.

Amplification efficiency of primers was estimated by SYBR Green chemistry RT-qPCR assay using 1, 10−1, 10−2, 10−3 and 10−4-fold dilutions of pooled cDNAs of three technical replicates for each gene. Each PCR mixture consisted of 1 μl of cDNA, 250 mM of primer and 12.0 μl of iTaq Universal SYBR Green Supermix (Bio-Rad Laboratories) in a total volume of 20 μl. The RT-qPCR profile used was 95°C for 3 min followed by 40 cycles of 95°C for 10 s, 55°C for 30 s and 72°C for 30 s. After RT-qPCR, melt curve analysis was also performed to check for unwanted products or contaminants during the PCR reaction. Three technical replicates were used for standard curve generation. The fluorescence signals were captured, quantification cycle (Cq) values were tabulated and each mean Cq was plotted against the logarithm of the pooled cDNA dilution factor. The slope in a linear regression model was used to determine the efficiency (E = [10(1/−slope) − 1] × 100%) for each primer. Correlation coefficient (R2) values were calculated using the standard curve.

RT-qPCR for calculation of expression stability and statistical analysis

The expression stability of the selected genes was calculated in RT-qPCR assay with selected primers and PCR conditions stated above. Each PCR mixture constituted of 1 μl of cDNA from each treatment sample, 250 mM of each primer and 12.0 μl of iTaq Universal SYBR Green Supermix (Bio-Rad Laboratories) in a total volume of 20 μl. The RT-qPCR profile was 95°C for 3 min followed by 40 cycles of 95°C for 10 s, 55°C for 30 s and 72°C for 30 s. Each reaction was performed for three technical replicates and three biological replicates in 96-well optical grade PCR plates sealed with optical sealing tape (Bio-Rad Laboratories). Melt curves were generated for each sample by heating the PCR amplicon from 65°C to 95°C with a gradual increase of 0.5°C per cycle of 0.5 s.

The Microsoft Excel-based software geNorm13, NormFinder26 and BestKeeper27 was used to rank reference genes for stability of expression across all experimental samples. For geNorm and NormFinder analyses, Cq values were transformed to relative quantities using the formula 2−ΔCq, where ΔCq = Cq of the gene in selected sample - minimum Cq of corresponding gene in the experiment. The sample with minimum Cq or maximum expression was used as the calibrator with a set value of 1. No transformed Cq values are required for BestKeeper analysis. Other procedures were followed according to the instructions mentioned in the software manual.