Selection of housekeeping genes as internal controls for quantitative RT-PCR analysis of the veined rapa whelk (Rapana venosa)

Background The veined rapa whelk Rapana venosa is an important commercial shellfish in China and quantitative real-time PCR (qRT-PCR) has become the standard method to study gene expression in R. venosa. For accurate and reliable gene expression results, qRT-PCR assays require housekeeping genes as internal controls, which display highly uniform expression in different tissues or stages of development. However, to date no studies have validated housekeeping genes in R. venosa for use as internal controls for qRT-PCR. Methods In this study, we selected the following 13 candidate genes for suitability as internal controls: elongation factor-1α (EF-1α), α-actin (ACT), cytochrome c oxidase subunit 1 (COX1), nicotinamide adenine dinucleotide dehydrogenase (ubiquinone) 1α subcomplex subunit 7 (NDUFA7), 60S ribosomal protein L5 (RL5), 60S ribosomal protein L28 (RL28), glyceraldehyde 3-phosphate dehydrogenase (GAPDH), β-tubulin (TUBB), 40S ribosomal protein S25 (RS25), 40S ribosomal protein S8 (RS8), ubiquitin-conjugating enzyme E2 (UBE2), histone H3 (HH3), and peptidyl-prolyl cis-trans isomerase A (PPIA). We measured the expression levels of these 13 candidate internal controls in eight different tissues and twelve larvae developmental stages by qRT-PCR. Further analysis of the expression stability of the tested genes was performed using GeNorm and RefFinder algorithms. Results Of the 13 candidate genes tested, we found that EF-1α was the most stable internal control gene in almost all adult tissue samples investigated with RL5 and RL28 as secondary choices. For the normalization of a single specific tissue, we suggested that EF-1α and NDUFA7 are the best combination in gonad, as well as COX1 and RL28 for intestine, EF-1α and RL5 for kidney, EF-1α and COX1 for gill, EF-1α and RL28 for Leiblein and mantle, EF-1α, RL5, and NDUFA7 for liver, GAPDH, PPIA, and RL28 for hemocyte. From a developmental perspective, we found that RL28 was the most stable gene in all developmental stages measured, and COX1 and RL5 were appropriate secondary choices. For the specific developmental stage, we recommended the following combination for normalization, PPIA, RS25, and RL28 for stage 1, RL5 and RL28 for stage 2 and 5, RL28 and NDUFA7 for stage 3, and PPIA and TUBB for stage 4. Discussion Our results are instrumental for the selection of appropriately validated housekeeping genes for use as internal controls for gene expression studies in adult tissues or larval development of R. venosa in the future.

138 The expression stability of 13 candidate housekeeping genes among the different RNA samples 139 was calculated using the Excel-based tool GeNorm v3.4 (https://genorm.cmgg.be/) and overall 140 stability of these candidate genes was determined using RefFinder 141 (http://fulxie.0fees.us/?type=reference&i=1). We used this suite of tools to ensure a statistically 142 thorough analysis and robust identification of housekeeping genes for use in qRT-PCR of R. 143 venosa. 144 GeNorm evaluates gene stability (M) of inputted genes using a statistical algorithm according 145 to geometric averaging of multiple control genes and means pairwise variation of a gene from 146  ). In addition, GeNorm is used to determine the optimal number of housekeeping genes by 153 pairwise number variation analysis. It computes the geometric mean of the selected genes that are 154 expressed steadily for accurate normalization. A pairwise variation below 0.15, which is 155 determined by Vn/n + 1, means that an added control gene (n + 1) would not further improve the 156 normalization factor (Vandesompele et al. 2002). 214 The GeNorm-derived M values of candidate housekeeping genes in all tissues and for each 215 tissue are shown in Fig. 2. We found that EF-1α and RL5 (both M = 0.52) showed the highest 216 stability in all tissues as well as in gonad, kidney, and liver. However, we found that EF-1α and 217 RL28 were the best combination of two internal control genes for gill, intestine, and mantle, 218 whereas the best control gene pairs for hemocyte and Leiblein were PPIA and HH3, and RL28 and 219 UBE2, respectively. These results indicate that EF-1α, RL5, and RL28 are the most appropriate 220 internal control genes for most tissues.
221 Fig. 2 also shows a ranking of the stability values calculated by NormFinder for tissue-specific 222 housekeeping genes. The best housekeeping genes on the basis of all tissues together was EF-1α 223 with a stability value of 0.63 as well as in Leiblein and mantle (0.16 and 0.10, respectively), 224 whereas in the gill and intestine it was COX1 (0.08 and 0.20, respectively), in hemocyte and gonad 225 it was RL28 (0.68 and 0.84, respectively), and in liver and kidney it was both RL5 and EF-1α (both 226 0.17 and 0.05, respectively). Based on these findings, EF-1α is an appropriate stable internal 227 control gene for all tissues analyzed either together or separately. In addition, RL5, COX1, and 228 RL28 can be used for their respective tissue-specific analyses.

229
Using SD values generated from BestKeeper for each of the various housekeeping genes, we 230 found major differences in each tissue (shown in Tables 3 and 4). RL28 was identified as the best 231 gene for all tissues together and in liver, whereas RS25 was identified for use in gill and mantle, 232 TUBB in hemocyte and Leiblein, HH3 in intestine, PPIA in kidney, and NDUFA7 in gonad. 233 However, as ranked by r, we found that EF-1α was the most stably expressed gene in all tissues 234 together and in most separate tissues, namely, gill, Leiblein, mantle, gonad, and kidney, although 235 EF-1α and RL5 had the same rank position in kidney. In addition, RL5 was the best gene in 236 hemocyte, whereas COX1 and PPIA were ideal for intestine and liver, respectively. Therefore, 237 based on our findings with r, EF-1α was the most stable housekeeping gene in most tissue samples.
238 Gene expression stability analysis in developmental stages 239 In this study, using five specimens, we identified 13 candidate reference genes and tested them for 240 expression stability in 12 different developmental larval stages. For data description and display 241 in tables and figures, we merged these 12 developmental larval stages into five groups/stages 242 according to their developmental characteristics: stage 1 (L, M, N); stage 2 (R, S, T); stage 3 (C, 243 D, F); stage 4 (G, J); and stage 5 (Y). 244 We analyzed the Cq values obtained from qRT-PCR and calculated variation among the 245 candidate housekeeping genes evaluated in the samples. In all stages combined, the average Cq of 246 the 13 genes ranged from 19.26 to 30.36 (Table 5). We found that EF-1α and COX1 had the lowest 247 mean Cq values, which represented the highest expression level, in both the total for all stages and 248 separately for each of the five stages, whereas we found that HH3 had the highest mean Cq values 249 in all of the different stages, which indicates it had the lowest expression levels. Interestingly, 250 stages 1 and 2 showed more variation in gene expression compared to that found in the other 251 stages, and that integral Cq values decreased gradually from stage 1 to stage 3 and were then 252 constant (Table S1). According to computed SE values, we found that HH3 (SE = 0.32), EF-1α 253 and NDUFA7 (both SE = 0.34), and GAPDH (SE = 0.35) had the least varying transcript 254 abundance when all stages were analyzed together. However, the tested housekeeping genes 255 showed different expression states in each stage under the same condition. The gene with the 256 lowest SE value for each of the five stages is RL28 (SE = 0.31) and COX1 (SE = 0.32) in stage 1, 257 PPIA (SE = 0.62) in stage 2, TUBB and RS25 (both SE = 0.15) in stage 3, PPIA and TUBB (both 258 SE = 0.10) in stage 4, and RS25 (SE = 0.19) in stage 5. These findings illustrate that no single 259 candidate housekeeping gene was expressed at a stable level on the basis of Cq value only across 260 five stages from different R. venosa larvae samples, and therefore, further statistical analyses were 261 required to identify the best housekeeping genes.

262
GeNorm-derived M values of the candidate housekeeping genes for all stages together and 263 stage-specific are shown in Fig. 3. COX1 and RL28 (both M = 0.34) have the highest stability in 264 all the stages together. However, we found that UBE2 and PPIA were the most suitable 265 combination of two internal controls for stage 1, RL5 and RL28 were optimal for stage 2, NDUFA7 266 and RL28 in stage 3, RS25 and PPIA in stage 4, and COX1 and RL5 in stage 5. These results 267 indicate that RL28 is the most appropriate internal control gene for most of the different stages 268 examined. In addition, Fig. 3 shows the ranked stability value calculated by NormFinder of the 269 tested candidate housekeeping genes, and we found that the best housekeeping gene for all 270 combined stages was RL28 with a stability value of 0.188. It was also the best gene for stages 1 271 and 3 (0.277 and 0.228, respectively), whereas RL5 was the best gene for stages 2 and 5 with 272 respective stability values of 0.142 and 0.112, and TUBB (0.022) for stage 4. Based on these 273 findings, RL28 and RL5 are the appropriate stable internal control genes for most developmental 274 stages.

275
Based on SD values generated from BestKeeper, we found that various housekeeping genes 276 manifested major differences in expression in each stage (Tables 6 and 4). We identified HH3 as 277 the best gene for all stages together, whereas the best stage-specific genes were RL28 for stage 1, 278 NDUFA7 for stage 2, TUBB for stage 3, and RS25 for stages 4 and 5. However, when ranking by 279 r, we found that RL28 was the most stably expressed gene when all stages were combined and for 280 stage 3 with PPIA, RL5, TUBB, and GAPDH being the ideal genes for stages 1, 2, 4, and 5, 281 respectively.
282 Determination of the optimal number of internal controls for normalization 283 In regards to the tissue-specific pairwise variation calculated by GeNorm (Fig. 4), we found that 284 in most tissues (gill, gonad, intestine, Leiblein, kidney, and mantle) the V4/5 value of 0.141 285 indicated EF-1α and RL5 are insufficient for normalization and that RL28 and NDUFA7 should be 286 included; however, the V2/3 value was under 0.15, which suggests two internal control genes were 287 sufficient. In liver, the V2/3 value (0.148) was comparable to the cut-off value although the V3/4 288 value (0.092) was sufficiently low, which indicates the inclusion of a third reference gene is needed 289 to improve stability of normalization (Fig.S2). We found that none of the pairwise variations 290 determined before V10/11 was less than 0.15 in hemocyte, and therefore, over 10 genes are suitable 291 as reference genes based on the determined conditions ( Figure S4). In terms of pairwise variation 292 for all developmental stages (Fig. 4), we found that the V2/3 value was below threshold (0.15), 293 which suggests RL28 and COX1 (on the basis of the M value) are sufficient for normalization, and 294 that inclusion of an additional reference gene is not required in most stages with the exception of 295 stage 1. We found in stage 1 that the V2/3 value exceeds threshold, whereas V3/4 is below 296 threshold, which indicates the necessity of adding a third reference gene (RS25 based on M value) 297 to improve the robustness of normalization (Fig.S3). 299 The commercial importance and ecological role of R. venosa have driven increased molecular 300 research towards investigating the morphology and biology of this organism, which may 301 commonly use qRT-PCR as a tool to study gene expression (Lusumin et al. 2008; Samadi & 302 Steiner 2009). It is imperative to study the expression patterns of specific genes in different larval 303 developmental stages and adult tissues in R. venosa, and qRT-PCR is a demonstrably powerful 304 tool to analyze such gene expression. Nevertheless, internal controls are critical to obtain reliable 305 normalization of gene expression, and in turn, robust results from qRT-PCR analysis. The ideal 306 internal control gene is characterized by stable expression across different environmental 307 conditions and physiological states, such as different developmental stages and tissue types. 308 However, according to findings from many related studies, housekeeping genes have variable 309 expression changing under different experimental conditions. Thus, no gene has stable expression 310 under all experimental conditions. As a consequence, using a single control gene in all 311 experimental conditions could influence the accuracy of normalization of gene expression results 312 (Jain et al. 2006). Therefore, evaluating the level of candidate housekeeping genes is a vital 313 preliminary effort to reliably quantifying target genes by qRT-PCR (Bustin 2009; Dheda et al. 314 2005). Prior to our study, there has been no published finding of research that has examined the 315 suitability of potential housekeeping genes for gene expression analysis in R. venosa. We selected 316 13 genes as candidate internal controls from high-throughput RNA-seq data of R. venosa larvae, 317 which showed high and stable levels of expression, and most of which were currently being used 318 as internal control genes in studies of Mollusca. The expression levels of these 13 genes were 319 monitored by qRT-PCR in eight different tissues and twelve developmental larval stages. 320 We found from comprehensive analysis using GeNorm, NormFinder, and BestKeeper in eight 321 different tissues analyzed together that EF-1α was the most stable internal control gene followed 322 by RL28. According to the pairwise variation calculated by GeNorm, two additional genes should 323 be added for normalization. Based on the rank determined by NormFinder and BestKeeper, RS8 324 and RS25 are the most appropriate internal controls; however, RL5 and NDUFA7 are determined 325 to be the most appropriate internal controls based on GeNorm. Considering these findings, we 326 conclude that EF-1α, RL28, RL5, and RS8 are the most stable gene combination for R. venosa 327 tissues. EF-1α belongs to the G-protein family, which has a significant influence in protein

344
In addition, using GeNorm, NormFinder, and BestKeeper we identified tissue-specific 345 expression levels of 13 candidate internal controls in eight different tissues to determine the most 346 stable gene for each tissue type. We found that EF-1α and NDUFA7 are the best combination for 347 normalization in gonad. For gill, EF-1α and COX1 are the best combination. COX1, which is 348 encoded from an approximately 650 bp fragment of the mitochondrial gene, is used to identify 349 animals and plants (Evans et al. 2007), but has been rarely used as an internal control of gene 350 expression (Kaweesi et al. 2014). However, in this study, we found it demonstrates relatively stable 351 expression and in fact can be the secondary internal control for R. venosa tissues and 352 developmental stages. In addition, we identified COX1 and RL28 for intestine, EF-1α and RL5 for 353 kidney, and EF-1α and RL28 for Leiblein and mantle. For liver, we determined from our analysis 354 that at least three internal controls are required for reliable normalization, which are EF-1α, RL5, 355 and NDUFA7. In regards to hemocyte, we determined based on pairwise variation that more than 356 10 genes should be used as reference genes, and therefore, we proposed using as many internal 357 controls available including GAPDH, PPIA, and RL28. Because of our comprehensive assessment, 358 it is evident that EF-1α is a good housekeeping gene as an internal control in most R. venosa tissues 359 and for a particular tissue, we recommend including additional corresponding housekeeping genes 360 to improve stability. 405 We identified and evaluated expression stability of 13 housekeeping genes for qRT-PCR 406 normalization in R. venosa tissues and larvae developmental stages. In our assessment of tissue-407 specific genes, EF-1α was the most stable internal control gene in most tissue samples tested with 408 RL5 and RL28 as suitable secondary choices. We found that RL28 was the most stable gene when 409 evaluating all measured developmental stages, and COX1 and RL5 were appropriate secondary 410 choices. To our knowledge, this study was the first to investigate and identify optimal 411 housekeeping genes for relative quantification of qRT-PCR in R. venosa. The results of this study 412 provide not only references to estimate gene expression levels during R. venosa larvae 413 developmental stages, but also enables future research efforts to measure readily and robustly 414 tissue-specific mRNA abundance in R. venosa.

Pairwise variation
664 Figure 4 Determination of the number of reference genes required for accurate normalization. Pairwise variation 665 by GeNorm between candidate genes in tissues (black bar) and developmental stages (white bar). 666