ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Software Tool Article
Revised

Validation of predicted mRNA splicing mutations using high-throughput transcriptome data

[version 2; peer review: 4 approved]
PUBLISHED 07 Apr 2014
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

This article is included in the Bioinformatics gateway.

Abstract

Interpretation of variants present in complete genomes or exomes reveals numerous sequence changes, only a fraction of which are likely to be pathogenic. Mutations have been traditionally inferred from allele frequencies and inheritance patterns in such data. Variants predicted to alter mRNA splicing can be validated by manual inspection of transcriptome sequencing data, however this approach is intractable for large datasets. These abnormal mRNA splicing patterns are characterized by reads demonstrating either exon skipping, cryptic splice site use, and high levels of intron inclusion, or combinations of these properties. We present, Veridical, an in silico method for the automatic validation of DNA sequencing variants that alter mRNA splicing. Veridical performs statistically valid comparisons of the normalized read counts of abnormal RNA species in mutant versus non-mutant tissues. This leverages large numbers of control samples to corroborate the consequences of predicted splicing variants in complete genomes and exomes.

Keywords

alternative splicing

Revised Amendments from Version 1

We appreciate the feedback received from the reviewers and we feel that the manuscript has been improved as a result. The following major revisions have been incorporated:

  • Figure 1 has been abbreviated according to the advice of the reviewers.
  • Figure 2 is new. It illustrates the details of the data used to infer different types of splicing events.
  • We have re-formulated the equations in the methods section pertaining to the definitions of the various validating reads, as requested by reviewers, and we reference each formal definition in the text.
  • We have clearly demarcated the performance information and have added sub-headings to improve the readability of our results section.
  • We have coloured all mutant reads in the IGV images to more clearly depict every validating read and its type.
  • We have added paragraphs to the Discussion pertaining to:
    • Achieving sample size sufficiency via power analysis
    • Clarifying that Veridical does not attempt to predict alternative splicing events
    • Clarifying our usage of the term “validation”
We have also updated the Veridical software, as follows:
  • It now outputs p-values to four (instead of two) decimal places.
    • This allows for more precise post-hoc filtering and can allow one to apply more stringent thresholds, as recently advocated (Johnson Proc. Natl. Acad. Sci. 110:19313-7, 2013).
  • The correct behavior of allowing a variant to be present in the filtered set is to check if there are any strongly corroborating, statistically significant, results. Previously, the filtered output file would only consider the p-value of the splicing consequence of highest frequency. This has been fixed.
    • Strongly corroborating reads are defined as: any junction-spanning-based evidence or read-abundance-based intron inclusion (refer to the manuscript for more details).
  • We noticed that variants with a statistically significant number of intron inclusion with mutation reads were erroneously placed into the filtered set, despite the overall number of intron inclusion reads not having achieved statistical significance. We have resolved this regression.
None of the changes to Veridical, nor the aforementioned regressions, affected any of the results presented in our manuscript. For a more detailed explanation of the revisions made please refer to the referees’ comments and our respective replies.

See the authors' detailed response to the review by Stefania Bortoluzzi
See the authors' detailed response to the review by Francesc Xavier Roca
See the authors' detailed response to the review by Liliana D. Florea
See the authors' detailed response to the review by Peter N. Robinson

Introduction

DNA variant analysis of complete genome or exome data has typically relied on filtering of alleles according to population frequency and alterations in coding of amino acids. Numerous variants of unknown significance (VUS) in both coding and non-coding gene regions cannot be categorized with these approaches. To address these limitations, in silico methods that predict biological impact of individual sequence variants on protein coding and gene expression have been developed, which exhibit varying degrees of sensitivity and specificity1. These approaches have generally not been capable of objective, efficient variant analysis on a genome-scale.

Splicing variants, in particular, are known to be a significant cause of human disease25 and indeed have even been hypothesized to be the most frequent cause of hereditary disease6. Computational identification of mRNA splicing mutations within DNA sequencing (DNA-Seq) data has been implemented to varying degrees of sensitivity, with most software only evaluating conservation solely at the intronic dinucleotides adjacent to the junction (i.e.7). Other approaches are capable of detecting significant mutations at other positions with constitutive, and in certain instances, cryptic, splice sites5,8,9 which can result in aberrations in mRNA splicing. Presently, only information theory-based mRNA splicing mutation analysis has been implemented on a genome scale10. Splicing mutations can abrogate recognition of natural, constitutive splice sites (inactivating mutation), weaken their binding affinity (leaky mutation), or alter splicing regulatory protein binding sites that participate in exon definition. The abnormal molecular phenotypes of these mutations comprise: (a) complete exon skipping, (b) reduced efficiency of splicing, (c) failure to remove introns (also termed intron retention or intron inclusion), or (d) cryptic splice site activation, which may define abnormal exon boundaries in transcripts using non-constitutive, proximate sequences, extending or truncating the exon. Some mutations may result in combinations of these molecular phenotypes. Nevertheless, novel or strengthened cryptic sites can be activated independently of any direct effect on the corresponding natural splice site. The prevalence of these splicing events has been determined by ourselves and others5,1113. The diversity of possible molecular phenotypes makes such aberrant splicing challenging to corroborate at the scale required for complete genome (or exome) analyses. This has motivated the development of statistically robust algorithms and software to comprehensively validate the predicted outcomes of splicing mutation analysis.

Putative splicing variants require empirical confirmation based on expression studies from appropriate tissues carrying the mutation, compared with control samples lacking the mutation. In mutations identified from complete genome or exome sequences, corresponding transcriptome analysis based on RNA sequencing (RNA-Seq) is performed to corroborate variants predicted to alter splicing. Manually inspecting a large set of splicing variants of interest with reference to the experimental samples’ RNA-Seq data in a program like the Integrative Genomics Viewer (IGV)14, or simply performing database searches to find existing evidence would be time-consuming for large-scale analyses. Checking control samples would be required to ensure that the variant is not a result of alternative splicing, but is actually causally linked to the variant of interest. Manual inspection of the number of control samples required for statistical power to verify that each displays normal splicing would be laborious and does not easily lend itself to statistical analyses. This may lead to either missing contradictory evidence or to discarding a variant due to the perceived observation of statistically insignificant altered splicing within control samples. In addition, a list of putative splicing variants returned by variant prediction software can often be extremely large. The validation of such a significant quantity of variants may not be feasible, for example, in certain types of cancer, in instances where the genomic mutational load is high and only manual annotation is performed. We have therefore developed Veridical, a software program that automatically searches all given experimental and control RNA-Seq data to validate DNA-derived splicing variants. When adequate expression data are available at the locus carrying the mutation, this approach reveals a comprehensive set of genes exhibiting mRNA splicing defects in complete genomes and exomes. Veridical and its associated software programs are available at: www.veridical.org.

Methods

The program Veridical was developed to allow high-throughput validation of predicted splicing mutations using RNA sequencing data. Veridical requires at least three files to operate: a DNA variant file containing putative mRNA splicing mutations, a file listing of corresponding transcriptome (RNA-Seq) BAM files, and a file annotating exome structure. A separate file listing RNA-Seq BAM files for control samples (i.e. normal tissue) can also be provided. Here, we demonstrate the capabilities of the software for mutations predicted in a set of breast tumours. Veridical compares RNA-Seq data from the same tumours with RNA-Seq data from control samples lacking the predicted mutation. However, in principle, potential splicing mutations for any disease state with available RNA-Seq data can be investigated. In each tumour, every variant is analyzed by checking the informative sequencing reads from the corresponding RNA-Seq experiment for non-constitutive splice isoforms, and comparing these results with the same type of data from all other tumour and normal samples that do not carry the variant in their exomes.

Veridical concomitantly evaluates control samples, providing for an unbiased assessment of splicing variants of potentially diverse phenotypic consequences. Note that control samples include all non-variant containing files (i.e. RNA-Seq files for those tumours without the variant of interest), as well any normal samples provided. Increasing the number of the set of control samples, while computationally more expensive, increases the statistical robustness of the results obtained.

For each variant, Veridical directly analyzes sequence reads aligned to the exons and introns that are predicted to be affected by the genomic variant. We elected to avoid indirect measures of exon skipping, such as loss of heterozygosity in the transcript, because of the possibility of confusion with other molecular etiologies (i.e. deletion or gene conversion), unrelated to the splicing mutations. The nearest natural site is found using the exome annotation file provided, based upon the directionality of the variant, as defined within Table 1. The genomic coordinates of the neighboring exon boundaries are then found and the program proceeds, iterating over all known transcript variants for the given gene. A diagram of this procedure is provided in Figure 1. The variant location, C, is specifically referring to the variant itself. JC refers to the variant-induced location of the predicted mRNA splice site, which is often proximate to, but distinct from the coordinate of the actual genomic mutation itself.

Table 1. Definitions used within Veridical to determine the direction in which reads are checked.

A and B represent natural site positions, defined in Figure 1(B).

Pertinent Splice Site
ABStrandDirection
ExonicDonorα+
ExonicDonorα-
IntronicAcceptorβ+
IntronicAcceptorβ-

 α – 5′ splice site         β – 3′ splice site    

309c02b8-8635-4585-be73-ec5afb2aa435_figure1.gif

Figure 1.

Diagram portraying the definitions used within Veridical to specify genic variant position and read coordinates. We employ the same conventions as IGV14. Blue lines denote genes, wherein thick lines represent exons and thin lines represent introns.

The program uses the BamTools API15 to iterate over all of the reads within a given genomic region across experimental and control samples. Individual reads are then assessed for their corroborating value towards the analysis of the variant being processed, as outlined in the flowchart in Figure 3. Validating reads are based on whether they alter either the location of the splice junction (i.e. junction-spanning) or the abundance of the transcript, particularly in intronic regions (i.e. read-abundance). Junction-spanning reads contain DNA sequences from two adjacent exons or are reads that extend into the intron (Equation 1(e)). These reads directly show whether the intronic sequence is removed or retained by the spliceosome, respectively. Read-abundance validated reads are based upon sequences predicted to be found in the mutated transcript in comparison with sequences that are expected to be excised from the mature transcript in the absence of a mutation (Equation 1(f)). Both types of reads can be used to validate cryptic splicing, exon skipping, or intron inclusion. A read is said to corroborate cryptic splicing if and only if the variant under consideration is expected to activate cryptic splicing. Junction-spanning, cryptic splicing reads are those in which a read is exactly split from the cryptic splice site to the adjacent exon junction (Equation 1(a)). For read-abundance cryptic splicing, we define the concept of a read fraction, which is the ratio of the number of reads corroborating the cryptically spliced isoform and the number of reads that do not support the use of the cryptic splice site (i.e. non-cryptic corroborating) in the same genomic region of a sample. Cryptic corroborating reads are those which occur within the expected region where cryptic splicing occurs (i.e. spliced-in regions). This region is bounded by the variant splice site location and the adjacent (direction dependent) splice junction (Equation 1(a)). Non-cryptic corroborating reads, which we also term “anti-cryptic” reads, are those that do not lie within this region, but would still be retained within the portion that would be excised, had cryptic splicing occurred (Equation 1(b)). To identify instances of exon skipping, Veridical only employs junction-spanning reads. A read is considered to corroborate exon skipping if the connecting read segments are split such that it connects two exon boundaries, skipping an exon in between (Equation 1(c)). A read is considered to corroborate intron inclusion when the read is continuous and either overlaps with the intron-exon boundary (and is then said to be junction-spanning) or if the read is within an intron (and is then said to be based upon read-abundance). We only consider an intron inclusion read to be junction spanning if it spans the relevant splice junction, A. Equation 1(d) formalizes this concept. We occasionally use the term “total intron inclusion” to denote that any such count of intron inclusion reads includes both those containing and not containing the mutation itself. Graphical examples of some of these validation events, with a defined variant location, are provided in Figure 2.

309c02b8-8635-4585-be73-ec5afb2aa435_figure2.gif

Figure 2.

Illustrative examples of aberrant splicing detection. Grey lines denote reads, wherein thick lines denote a read mapping to genomic sequence and thin lines represent connecting segments of reads split across spliced-in regions (i.e. exons or included introns). Dotted blue rectangles denote portions of genes which are spliced out in a mutant transcript, but are otherwise present in a normal transcript. Mutant reads are purple if they are junction-spanning and green if they are read-abundance based. Start and end coordinates of reads with two portions are denoted by (rs1, re1) and (rs2, re2), while coordinates of those with only a single portion are denoted by (rs, re). Refer to the caption of Figure 1 for additional graphical element descriptions.

309c02b8-8635-4585-be73-ec5afb2aa435_figure3.gif

Figure 3.

The algorithm employed by Veridical to validate variants. Refer to Table 1 for definitions concerning direction and Figure 1 for variable depictions. B is defined as follows: B (B site left (←) of AB := D. B site right (→) of AB := E.

We proceed to formalize the above descriptions as follows. A given read is denoted by r, with start and end coordinates (rs, re), if the read is continuous, or otherwise, with start and end coordinate pairs, (rs1, re1) and (rs2, re2) as diagrammed within Figure 2. Let be the length of the read. The set ζ denotes the totality of validating reads. The criterion for r ∈ ζ is detailed below. It is important to note that validating reads are necessary but not sufficient to validate a variant. Sufficiency is achieved only if the number of validating reads is statistically significant relative to those present in control samples. ζ itself is partitioned into three sets: ζc, ζe, and ζi for evidence of cryptic splicing, exon skipping, and intron inclusion, respectively. We allow partitions to be empty. Let JC denote the adjacent splice junction, and let B denote the downstream natural site, as defined by Figure 2 and Table 1. Without loss of generality, we consider only the red (i.e. direction is right) set of labels within Figure 1(B), as further typified by Figure 2. Then the (splice consequence) partitions of ζ are given by:

rζc ⇔ variant is cryptic ∧ (rs2re1 = BJC ∨ (rs > JCre < A))        (1a)

rζc ∧ variant is cryptic ∧ ¬ ( rs2re1 = BJC) ⇒ r ∈ anti-cryptic        (1b)

rζe(re1=Drs2=E)        (1c)

rζi ⇔ (A∈ [rs, re]) ∨ ((A ∉ [rs, re]) ∧ rs > Are < B ∧ ¬ (A ∈ [rs, re]))        (1d)

We separately partition ζ by its evidence type, the set of junction-spanning reads, δ and read-abundance reads, α:

rδ ⇔ (A ∈ [rs, re]) ∨ (rζcrs2re1 = BJC)        (1e)

r ∈ α ⇔ rδ        (1f)

Once all validating reads are tallied for both the experimental and control samples, a p-value is computed. This is determined by computing a z-score upon Yeo-Johnson (YJ)16 transformed data. This transformation, shown in Equation 2, ensures that the data is sufficiently normally distributed to be amenable to parametric testing.

ψ(x,λ)={(x+1)λ1λifx0λ0log(x+1)ifx0λ=0(x+1)2λ12λifx<0λ2log(x+1)ifx<0λ=2(2)

The transform is similar to the Box-Cox power transformation, but obviates the requirement of inputting strictly positive values and has more desirable statistical properties. Furthermore, this transformation allowed us to avoid the use of non-parametric testing, which has its own pitfalls regarding assumptions of the underlying data distribution17. We selected λ = 12, because Veridical’s untransformed output is skewed left, due to their being, in general, less validating reads in control samples and the fact that there are, by design, vastly more control samples than experimental samples. We found that this value for λ generally made the distribution much more normal. A comparison of the distributions of untransformed and transformed data is provided in Figure S1. We were not concerned about small departures from normality as a z-test with a large number of samples is robust to such deviations18.

Thus, we can compute the p-value of the pairwise unions of the two sets of partitions of ζ, except the irrelevant ζeα = Ø. We only provide p-values for these pairwise unions and do not attempt to provide p-values for the partitions for the different consequences of the mutations on splicing. While such values would be useful, we do not currently have a robust means to compute them. Our previous work provides guidance on interpretation of splicing mutation outcomes35,10. Thus for ζx ∈ {ζc, ζe, ζi}, let ΦZ (z) represent the cumulative distribution function of the one-sided (right-tailed — i.e. P[X > x]) standard normal distribution. Let N represent the total number of samples and let V represent the set of all ζx validations, across all samples. Then:

μ=j=1NVjNσ=1Nj=1N(VjV¯)2z=|ζx|μσp=Φ(ψ(z,12))

The program outputs two tables, along with summaries thereof. The first table lists all validated read counts across all categories for experimental samples, while the second table does the same for the control samples. P-values are shown in parentheses within the experimental table, which refer to the column-dependent (i.e. the read type is given in the column header) p-value for that read type with respect to that same read type in control samples. The program produces three files: a log file containing all details regarding validated variants, an output file with the programs progress reports and summaries, and a filtered validated variant file. The filtered file contains all validated variants of statistical significance (set as p < 0.05, by default), defined as variants with one or more validating reads achieving statistical significance in a strongly corroborating read type. These categories are limited to all junction-spanning based splicing consequences and read-abundance total intron inclusion. For example, a cryptic variant for which p = 0.04 in the junction-spanning cryptic column would meet this criteria, assuming the default significance threshold.

The p-values given by Veridical are more robust when the program is provided with a large number of samples. The minimum sample size is dependent upon the desired power, α value, and the effect size (ES). The minimum samples size could be computed as follows: N=σ2z2ES2. For α = 0.05 and β = 0.2 (for a power of 0.8): z = 2.4865 for the one-tailed test. Then, N=σ22.48652ES2. Ideally, Veridical could be run with a trial number of samples.

Then, one would compute effect sizes from Veridical’s output. The standard deviation in the above formula could also be estimated from one’s data, although it should be transformed using Yeo-Johnson (such as via an appropriate R package) before computing this estimation.

We elected to use RefSeq19 genes for the exome annotation, as opposed to, the more permissive exome annotation sets, UCSC Known Genes20 or Ensembl21. The large number of transcript variants within Ensembl, in particular, caused many spurious intron inclusion validation events. This occurred because reads were found to be intronic in many cases, when in actuality they were exonic with respect to the more common transcript variant. In addition, the inclusion of the large number of rare transcripts in Ensembl significantly increased program run-time and made validation events much more challenging to interpret unequivocally. The use of RefSeq, which is a conservative annotation of the human exome, resolves these issues. It is possible that some subset of unknown or Ensemble annotated intronic transcripts could be sufficiently prevalent to merit inclusion in our analysis. We do not attempt to perform the difficult task of deciding which of these transcripts would be worth using. Indeed, the task of confirming and annotating of such transcripts is already done by the more conservative annotation we employ.

We also provide an R program22 which produces publication quality histograms displaying embedded Q-Q plots and p-values, to evaluate for normality of the read distribution and statistical significance, respectively. The R program performs the YJ transformation as implemented in the car package23. The histograms generated by the program use the Freedman-Draconis24 rule for break determination, and the Q-Q plots use algorithm Type 8 for their quantile function, as recommended by Hyndman and Fan25. This program is embedded within a Perl script, for better integration into our workflow. Lastly, a Perl program was implemented to automatically retrieve and correctly format an exome annotation file from the UCSC database20 for use in Veridical. All data use hg19/GRCh37, however when new versions of the genome become available, this program can be used to update the annotation file.

Results

Veridical validates predicted mRNA splicing mutations using high-throughput RNA sequencing data. We demonstrate how Veridical and its associated R program are used to validate predicted splicing mutations in somatic breast cancer. Each example depicts a particular variant-induced splicing consequence, analyzed by Veridical, with its corresponding significance level. The relevant primary RNA-Seq data are displayed in IGV, along with histograms and Q-Q plots showing the read distributions for each example. The source data are obtained from controlled-access breast carcinoma data from The Cancer Genome Atlas (TCGA)26. Tumour-normal matched DNA sequencing data from the TCGA consortium was used to predict a set of splicing mutations, and a subset of corresponding RNA sequencing data was analyzed to confirm these predictions with Veridical. Overall, 442 tumour samples and 106 normal samples were analyzed. Briefly, all variants used as examples in this manuscript came from running the matched TCGA exome files (to which the RNA-Seq data corresponds) through SomaticSniper27 and Strelka28 to call somatic mutations, followed by the Shannon Human Splicing Pipeline10 to find splicing mutations, which served as the input to Veridical. Details of the RNA-Seq data can be found within the supplementary methods of the TCGA paper26. Accordingly, the following examples demonstrate the utility of Veridical to identify potentially pathogenic mutations from a much larger subset of predicted variants.

Leaky Mutations

Mutations that reduce, but not abolish, the spliceosome’s ability to recognize the intron/exon boundary are termed leaky3. This can lead to the mis-splicing (intron inclusion and/or exon skipping) of many but not all transcripts. An example, provided in Figure 4, displays a predicted leaky mutation (chr5:162905690G>T) in the HMMR gene in which both junction-spanning exon skipping (p < 0.01) and read-abundance-based intron inclusion (p = 0.04) are observed. We predict this mutation to be leaky because its final Ri exceeds 1.6 bits — the minimal individual information required to recognize a splice site and produce correctly spliced mRNA4. Indeed, the natural site, while weakened by 2.16 bits, remains strong — 10.67 bits. This prediction is validated by the variant-containing sample’s RNA-Seq data (Figure 4), in which both exon skipping (5 reads) and intron inclusion (14 reads, 12 of which are shown, versus an average of 4.051 such reads per control sample) are observed, along with 70 reads portraying wild-type splicing. Only a single normally spliced read contains the GT mutation. These results are consistent with an imbalance of expression of the two alleles, as expected for a leaky variant. Figure 5 shows that for the distribution of read-abundance-based intron inclusion is marginally statistically significant (p = 0.04).

309c02b8-8635-4585-be73-ec5afb2aa435_figure4.gif

Figure 4.

IGV images depicting a predicted leaky mutation (chr5:162905690G>T) within the natural acceptor site of exon 12 (162905689–162905806) of HMMR. This gene has four transcript variants and the given exon number pertains to isoforms a and b (reference sequences NM_001142556 and NM_012484). RNA-Seq reads are shown in the centre panel. The bottom blue track depicts RefSeq genes, wherein each blue rectangle denotes an exon and blue connecting lines denote introns. In the middle panel, each rectangle (grey by default) denotes an aligned read, while thin lines are segments of reads split across exons. Red and blue coloured rectangles in the middle panel denote aligned reads of inserts that are larger or smaller than expected, respectively. Reads are highlighted by their splicing consequence, as follows: cryptic splicing (green), exon skipping (purple), junction-spanning intron inclusion (dark green), and read-abundance intron inclusion (cyan). (A) depicts a genomic region of chromosome 5: 162902054–162909787. The variant occurs in the middle exon. Intron inclusion can be seen in this image, represented by the reads between the first and middle exon (since the direction is left, as described within Table 1). These 14 reads are read-abundance-based, since they do not span the intron-exon junction. (B) depicts a closer view of the region shown in (A) — 162905660–162905719. The dotted vertical black lines are centred upon the first base of the variant-containing exon. The thin lines in the middle panel that span the entire exon fragment are evidence of exon skipping. These 5 reads are split across the exon before and after the variant-containing exon, as seen in (A).

309c02b8-8635-4585-be73-ec5afb2aa435_figure5.gif

Figure 5.

Histogram of read-abundance-based intron inclusion with embedded Q-Q plots of the predicted leaky mutation (chr5:162905690G>T) within HMMR, as shown in Figure 4. The arrowhead denotes the number of reads (14 in this case) in the variant-containing file, which is more than observed in the control samples (p = 0.04).

Inactivating Mutations

Variants that inactivate splice sites have negative final Ri values3 with only rare exceptions4, indicating that splice site recognition is essentially abolished in these cases. We present the analysis of two inactivating mutations within the PTEN and TMTC2 genes from different tumour exomes, namely: chr10:89711873A>G and chr12:83359523G>A, respectively. The PTEN variant displays junction-spanning exon skipping events (p < 0.01), while the TMTC2 gene portrays both junction-spanning and read-abundance-based intron inclusion (both splicing consequences with p < 0.01). In addition, all intron inclusion reads in the experimental sample contain the mutation itself, while only one such read exists across all control samples analyzed (p < 0.01). The PTEN variant contains numerous exon skipping reads (32 versus an average of 2.466 such reads per control sample). The TMTC2 variant contains many junction-spanning intron inclusion reads with the GA mutation (all of its junction-spanning intron inclusion reads: 22 versus an average of 0.002 such reads per control sample). IGV screenshots for these variants are provided within Figure 6. This figure also shows an example of junction-spanning cryptic splice site activated by the mutation (chr1:985377C>T) within the AGRN gene. The concordance between the splicing outcomes generated by these mutations and the Veridical results indicates that the proposed method detects both mutations that inactivate splice sites and cryptic splice site activation.

309c02b8-8635-4585-be73-ec5afb2aa435_figure6.gif

Figure 6.

(A) depicts an inactivating mutation (chr10:89711873A>G) within the natural acceptor site of exon 6 (89711874–89712016) of PTEN. The dotted vertical black line denotes the location of the relevant splice site. The region displayed is 89711004–89712744 on chromosome 10. Many of the 32 exon skipping reads are evident, typified by the thin lines in the middle panel that span the entire exon. There is also a substantial amount of read-abundance-based intron inclusion, shown by the reads to the left of the dotted vertical line. Exon skipping was statistically significant (p < 0.01), while read-abundance-based intron inclusion was not (p = 0.53). Panels (B) and (C) depict an inactivating mutation (chr12:83359523G>A) within the natural donor site of exon 6 (83359338–83359523) of TMTC2. (B) depicts a closer view (83359501–83359544) of the region shown in (C) and only shows exon 6. Some of the 22 junction-spanning intron inclusion reads can be seen. In this case, all of these reads contain the mutation, shown by the green adenine base in each read, between the two vertical dotted lines. (C) depicts a genomic region of chromosome 12: 83359221–83360885, TMTC2 exons 6–7. The variant occurs in the left exon. 65 read-abundance-based intron inclusion can be seen in this image, represented by the reads between the two exons. Panel (D) depicts a mutation (chr1:985377C>T) causing a cryptic donor to be activated within exon 27 (the second from left, 985282–985417) of AGRN. The region displayed is 984876–985876 on chromosome 1 (exons 26–29 are visible). Some of the 34 cryptic (junction-spanning) reads are portrayed. The dotted black vertical line denotes the cryptic splice site, at which cryptic reads end. The read-abundance-based intron inclusion, of which two reads are visible, was not statistically significant (p = 0.68). Refer to the caption of Figure 4 for IGV graphical element descriptions.

Cryptic Mutations

Recurrent genetic mutations in some oncogenes have been reported among tumours within the same, or different, tissues of origin. Common recurrent mutations present in multiple abnormal samples are recognized by Veridical. This avoids including a variant-containing sample among the control group, and outputs the results of all of the variant-containing samples. A relevant example is shown in Figure 7. The mutation (chr1:46726876G>T) causes activation of a cryptic splice site within RAD54L in multiple tumours. Upon computation of the p-values for each of the variant-containing tumours, relative to all non-variant containing tumours and normal controls, not all variant-containing tumours displayed splicing abnormalities at statistically significant levels. Of the six variant-containing tumours, two had significant levels of junction-spanning intron inclusion, and one showed statistically significant read-abundance-based intron inclusion. Details for all of the aforementioned variants, including a summary of read counts pertaining to each relevant splicing consequence, for experimental versus control samples, are provided in Table 2.

309c02b8-8635-4585-be73-ec5afb2aa435_figure7.gif

Figure 7.

IGV images and their corresponding histograms with embedded Q-Q plots depicting all six variant-containing files with a mutation (chr1:46726876G>T) which, in some cases, causes a cryptic donor to be activated within the intron between exons 7 and 8 of RAD54L. This results in the extension of the downstream natural donor (the 5′ end of exon 8). This gene has two transcript variants and the given exon numbers pertain to isoform a (reference sequence NM_003579). Only samples IV and V have statistically significant intron inclusion relative to controls. read-abundance-based intron inclusion can be seen in (A), between the two exons. The region displayed is on chromosome 1: 46726639–46726976. (B) depicts the corresponding histogram for the 15 read-abundance-based intron inclusion reads (p = 0.05) that are present in sample IV. The intron-exon boundary on the right is the downstream natural donor. (C) typifies some of the 13 junction-spanning intron inclusion reads that are a direct result of the intronic cryptic site’s activation. In these instances, reads extending past the intron-exon boundary are being spliced at the cryptic site, instead of the natural donor. In particular, samples IV and V both have a statistically significant numbers of such reads, 7 (p = 0.01) and 5 (p = 0.04), respectively. This is further typified by the corresponding histogram in (D). (C) focuses upon exon 8 from (A) and displays the genomic positions 46726908–46726957. Refer to the caption of Figure 4 for IGV graphical element descriptions. In the histograms, arrowheads denote numbers of reads in the variant-containing files. The bottom of the plots provide p-values for each respective arrowhead. Statistically significant p-values and their corresponding arrowheads are denoted in red.

Table 2. Examples of variants validated by Veridical and their selected read types.

Header abbreviations Chr, Cv, Cs, #, SC, and ET, denote chromosome, variant coordinate, splice site coordinate, sample number (where applicable), splicing consequence, and evidence type, respectively. Headers containing R with some subscript denote numbers of validated reads for the specified variant’s splicing consequence(s) and evidence type(s). RE denotes reads within variant-containing tumour samples. RT and RN denote control samples, for tumours and normal cells, respectively. Rμ is the per sample mean of RT and RN. Splicing consequences: CS denotes cryptic splicing, ES denotes exon skipping, and II denotes intron inclusion. Evidence types: JS denotes junction-spanning and RA denotes read-abundance.

GeneChr Cv Cs VariantTypeInitial
Ri
Final
Ri
ΔRi #SCETp-value RE RT RN Rμ Figure
HMMR chr5162905690162905689G/TLeaky12.8310.67-2.16ESJS<  0.0151100.020 4, 5
IIRA0.041421331034.051
PTEN chr108971187389711874A/GInactivating12.09-2.62-14.71ESJS<  0.01329753862.466 6(A)
TMTC2 chr128335952383359524G/AInactivating1.74-1.27-3.01IIJS<  0.012222413834.754 6(B)
IIJSwM< 0.0122010.002
IIRA< 0.01657293139515.739 6(C)
AGRN chr1985377985376C/TCryptic-2.244.797.03CSJS<  0.013497230.217 6(D)
RAD54L chr14672687646726895G/TCryptic13.414.841.44IIIJSN/A0645581.274 7
IIRA0.54321712904.458
IIIIJS0.511645581.274
IIRA0.33621712904.458
IIIIIJSN/A0645581.274
IIRA0.33621712904.458
IVIIJS0.017645581.274
IIRA0.051521712904.458
VIIJS0.045645581.274
IIRAN/A021712904.458
VIIIJSN/A0645581.274
IIRAN/A021712904.458

Performance

The performance of the software is affected by the number of predicted splicing mutations, the number of abnormal samples containing mutations and control samples and the corresponding RNA-Seq data for each type of sample. Veridical has the ability to analyze approximately 3000 variants in approximately 4 hours, assuming an input of 100 BAM files of RNA-Seq data. The relationship between time and numbers of BAM files and variants are plotted in Figure 8 for a 2.27 GHz processor. Veridical uses memory in linear proportion to the number and size of the input BAM files. In our tests, using RNA-Seq BAM files with an average size of approximately 6 GB, Veridical used approximately 0.7 GB for ten files to 1 GB for 100 files.

309c02b8-8635-4585-be73-ec5afb2aa435_figure8.gif

Figure 8.

Profiling data for Veridical runtime. Tests were conducted upon an Intel Xeon @ 2.27 GHz. Visualizations were generated with R22 using Lattice30 and Effects31. A surface plot of time vs. numbers of BAM files and variants is provided in (A). Effect plots are given in (B) and demonstrate the effects of the numbers of BAM files and variants upon runtime. The effect plots were generated using a linear regression model (R2 = 0.7525).

Discussion

We have implemented Veridical, a software program that automates confirmation of mRNA splicing mutations by comparing sequence read-mapped expression data from samples containing variants that are predicted to cause defective splicing with control samples lacking these mutations. The program objectively evaluates each mutation with statistical tests that determine the likelihood of and exclude normal splicing. To our knowledge, no other software currently validates splicing mutations with RNA-Seq data on a genome-wide scale, although many applications can accurately detect conventional alternative splice isoforms (i.e.29). Veridical is intended for use with large data sets derived from many samples, each containing several hundred variants that have been previously prioritized as likely splicing mutations, regardless of how the candidate mutations are selected. It is not practical to analyze all variants present in an exome or genome, rather only a filtered subset, due to the extensive computations required for statistical validation. As such, Veridical is a key component of an end-to-end, hypothesis-based, splicing mutation analysis framework that also includes the Shannon splicing mutation pipeline10 and the Automated Splice Site Analysis and Exon Definition server5. There is a trade-off between lengthy run-times and statistical robustness of Veridical, especially when there are either a large number of variants or a large number of RNA-Seq files. As with most statistical methods, those employed here are not amenable to small sample sets, but become quite powerful when a large number of controls are employed. In order to ensure that mutations can be validated, we recommend an excess of control transcriptome data relative to those from samples containing mutations (> 5 : 1), guided by the power analysis described in Methods. We do not recommend the use of a single nor a few control samples to corroborate a putative mutation. Not surprisingly, we have found that junction-spanning reads have the greatest value for corroborating cryptic splicing and exon skipping. Even a single such read is almost always sufficient to merit the validation of a variant, provided that sufficient control samples are used. For intron inclusion, both junction-spanning and read-abundance-based reads are useful and a variant can readily be validated with either, provided that the variant-containing experimental sample(s) show a statistically significant increase in the presence of either form of intron inclusion corroborating reads.

Veridical is able to automatically process variants from multiple different experimental samples, and can group the variant information if any given mutation is present in more than one sample. The use of a large sample size allows for robust statistical analyses to be performed, which aid significantly in the interpretation of results. The main utility of Veridical is to filter through large data sets of predicted splicing mutations to prioritize the variants. This helps to predict which variants will have a deleterious effect upon the protein product. Veridical is able to avoid reporting splicing changes that are naturally occurring through checking all variant-containing and non-containing control samples for the predicted splicing consequence. In addition, running multiple tumour samples at once allows for manual inspection to discover samples that contained the alternative splicing pattern, and consequently, permits the identification of DNA mutations in the same location which went undetected during genome sequencing.

The statistical power of Veridical is dependent upon the quality of the RNA-Seq data used to validate putative variants. In particular, a lack of sufficient coverage at a particular locus will cause Veridical to be unable to report any significant results. A coverage of at least 20 reads should be sufficient. This estimate is based upon alternative splicing analyses in which this threshold was found to imply concordance with microarray and RT-PCR measurements3235. There are many potential legitimate reasons why a mutation may not be validated: (a) A lack of gene expression in the variant containing tumour sample, (b) nonsense-mediated decay may result in a loss of expression of the entire transcript, (c) the gene itself may have multiple paralogs and reads may not be unambiguously mapped, (d) other non-splicing mutations could account for a loss of expression, and (e) confounding natural alternative splicing isoforms may result in a loss of statistical significance during read mapping of the control samples. The prevalence of loci with insufficient data is dependent upon the coverage of the sequencing technology used. As sequencing technologies improve, the proportion of validated mutations is expected to increase. Such an increase would mirror that observed for the prevalence of alternative splicing events36. In addition, mutated splicing factors can disrupt splicing fidelity and exon definition37. This effect could decrease Veridical’s ability to validate splicing mutations affected by a disruption of the definition of the pertinent exon. Veridical does not currently form any equivalence between distinct variants affecting the same splice site. Such variants will be analyzed independently. Veridical is intended to be used with RNA-Seq data that not only corresponds to matched DNA-Seq data, but also only for sets of samples with comparable sequencing protocols, since the non-normalized comparisons performed rely upon the evening out of batch effects, due to a substantial number of control samples. It is important to note that acceptance of the null hypothesis, due to an absence of evidence required to disprove it, does not imply that the underlying prediction of a mutation at a particular locus is incorrect, but merely that the current empirical methods employed were insufficient to corroborate it.

“Validate,” in the present context, refers to the condition where sufficient statistical evidence has been marshaled in support of a variant. However, the threshold for significance can vary so these analyses can also be thought of as strongly corroborating variants. Recent studies in Bayesian statistics have suggested that a p-value threshold of 0.05 does not correspond to strong support of the alternative hypothesis. Accordingly, Johnson38 recommends the use of tests at the 0.005 or 0.001 level of significance.

We consider alternative splicing to be a different problem. Veridical does not aim to identify putatively pathogenic variants, but rather, to confirm existing in silico predictions thereof. We do infer exon skipping events (i.e. alternative splicing) de novo, but only to catalog dysregulated splicing “phenotypes” due to genomic sequence variants. This is not the first study to use a large control dataset. Indeed the Variant Annotation, Analysis & Search Tool (VAAST)39 does this to search for disease-causing (non-splicing) variants and the Multivariate Analysis of Transcript Splicing (MATS)29 tool (among others) can be used for the discovery of alternative splicing events. However, in our case, in most instances the distribution of reads in a single sample is compared to the distributions of reads in the control set, as opposed to a likelihood framework-based approach. We are suggesting that our approach be coupled to existing approaches to act as an a posteriori, hypothesis-driven, check on the veridicality of specific variants.

While there is considerable prior evidence for splicing mutations that alter natural and cryptic splice site recognition, we were somewhat surprised at the apparent high frequency of statistically significant intron inclusion revealed by Veridical. In fact, evidence indicates that a significant portion of the genome is transcribed36, and it is estimated that 95% of known genes are alternatively spliced32. Defective mRNA splicing can lead to multiple alternative transcripts including those with retained introns, cassette exons, alternate promoters/terminators, extended or truncated exons, and reduced exons40. In breast cancer, exon skipping and intron retention were observed to be the most common form of alternative splicing in triple negative, non-triple negative, and HER2 positive breast cancer41. In normal tissue, intron retention and exon skipping has been predicted to affect 2572 exons in 2127 genes and 50 633 exons in 12 797 genes, respectively42. In addition, previous studies suggest that the order of intron removal can influence the final mRNA transcript composition of exons and introns43. Intron inclusion observed in normal tissue may result from those introns that are removed from the transcript at the end of mRNA splicing. Given that these splicing events are relatively common in normal tissues, it becomes all the more important to distinguish expression patterns that are clearly due to the effects of splicing mutations — one of the guiding principles of the Veridical method.

Veridical is an important analytical resource for unsupervised, thorough validation of splicing mutations through the use of companion RNA-Seq data from the same samples. The approach will be broadly applicable for many types of genetic abnormalities, and should reveal numerous, previously unrecognized, mRNA splicing mutations in exome and complete genome sequences.

Data availability

figshare: Input, output, and explanatory files for Veridical, http://dx.doi.org/10.6084/m9.figshare.89497144.

Comments on this article Comments (0)

Version 2
VERSION 2 PUBLISHED 13 Jan 2014
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Viner C, Dorman SN, Shirley BC and Rogan PK. Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved] F1000Research 2014, 3:8 (https://doi.org/10.12688/f1000research.3-8.v2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 2
VERSION 2
PUBLISHED 07 Apr 2014
Revised
Views
40
Cite
Reviewer Report 19 May 2014
Francesc Xavier Roca, Division of Molecular Genetics and Cell Biology, Nanyang Technological University, Singapore, Singapore 
Approved
VIEWS 40
All ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Roca FX. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.4148.r4404)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
48
Cite
Reviewer Report 09 May 2014
Liliana D. Florea, McKusick-Nathans Institute of Human Genetics, Johns Hopkins University, Baltimore, MD, USA 
Approved
VIEWS 48
My comments have been adequately addressed, either ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Florea LD. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.4148.r4407)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
53
Cite
Reviewer Report 08 Apr 2014
Stefania Bortoluzzi, Department of Biology, University of Padova, Padova, Italy 
Approved
VIEWS 53
The revised version of the manuscript is improved, ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Bortoluzzi S. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.4148.r4405)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Version 1
VERSION 1
PUBLISHED 13 Jan 2014
Views
66
Cite
Reviewer Report 18 Mar 2014
Peter N. Robinson, Institute for Medical Genetics, Universitätsklinikum Charité, Berlin, Germany 
Approved
VIEWS 66
This paper from the Rogan group presents a methodology for validation of DNA sequencing variants that alter mRNA splicing. While variants of the most conserved splice site nucleotides at the intron-exon boundary can be predicted to cause splice defects with ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Robinson PN. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.3444.r3121)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    We really appreciate your constructive comments. In particular, we have revised figure 1 as you suggested. Furthermore, we have endeavored to clarify the methods by referencing every corresponding equation pertaining ... Continue reading
  • Reviewer Response (F1000Research Advisory Board Member) 14 Aug 2015
    Peter Robinson, Institute for Medical Genetics, Universitätsklinikum Charité, Berlin, Germany
    14 Aug 2015
    Reviewer Response F1000Research Advisory Board Member
    Thanks for the detailed response to my comments.
    Competing Interests: No competing interests were disclosed.
COMMENTS ON THIS REPORT
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    We really appreciate your constructive comments. In particular, we have revised figure 1 as you suggested. Furthermore, we have endeavored to clarify the methods by referencing every corresponding equation pertaining ... Continue reading
  • Reviewer Response (F1000Research Advisory Board Member) 14 Aug 2015
    Peter Robinson, Institute for Medical Genetics, Universitätsklinikum Charité, Berlin, Germany
    14 Aug 2015
    Reviewer Response F1000Research Advisory Board Member
    Thanks for the detailed response to my comments.
    Competing Interests: No competing interests were disclosed.
Views
100
Cite
Reviewer Report 07 Mar 2014
Liliana D. Florea, McKusick-Nathans Institute of Human Genetics, Johns Hopkins University, Baltimore, MD, USA 
Approved with Reservations
VIEWS 100
The authors describe a method and the associated software, Veridical, for assessing the effects on pre-mRNA splicing of predicted splicing-affecting mutations. To do so the program compares splicing effects, measured by the supporting read counts, in variant-containing (disease) samples against a ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Florea LD. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.3444.r3524)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    We greatly appreciate your remarks and useful suggestions. All references to figures pertain to the first version of the manuscript.
     
    First, we address your comments regarding VAAST and de novo exome ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    We greatly appreciate your remarks and useful suggestions. All references to figures pertain to the first version of the manuscript.
     
    First, we address your comments regarding VAAST and de novo exome ... Continue reading
Views
63
Cite
Reviewer Report 28 Jan 2014
Francesc Xavier Roca, Division of Molecular Genetics and Cell Biology, Nanyang Technological University, Singapore, Singapore 
Approved
VIEWS 63
This manuscript describes a new computational tool named Veridical, which detects mutant-allele specific splicing changes from large RNAseq datasets. This outstanding tool appears very useful to screen the wealth of transcriptomic data for effects in splicing due to mutations in ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Roca FX. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.3444.r3119)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    We greatly appreciate your review of this paper.
     
    Regarding our use of the term “validation”, we understand that in silico validation is not comparable to in vitro or in vivo validation ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    We greatly appreciate your review of this paper.
     
    Regarding our use of the term “validation”, we understand that in silico validation is not comparable to in vitro or in vivo validation ... Continue reading
Views
92
Cite
Reviewer Report 27 Jan 2014
Stefania Bortoluzzi, Department of Biology, University of Padova, Padova, Italy 
Approved with Reservations
VIEWS 92
The paper “Validation of predicted mRNA splicing mutations using high-throughput transcriptome data” by Viner et al. presents Veridical, a new software for the interpretation and validation of genetic variants identified by DNA sequencing that alter mRNA splicing, leveraging RNA-seq data. ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Bortoluzzi S. Reviewer Report For: Validation of predicted mRNA splicing mutations using high-throughput transcriptome data [version 2; peer review: 4 approved]. F1000Research 2014, 3:8 (https://doi.org/10.5256/f1000research.3444.r3120)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    Thank you for your constructive comments. Regarding the R program, it is embedded within a Perl script to facilitate expeditious parameter and data parsing. All of our figures were generated ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 27 Mar 2014
    Peter K Rogan, Schulich School of Medicine and Dentistry, University of Western Ontario, Canada
    27 Mar 2014
    Author Response
    Thank you for your constructive comments. Regarding the R program, it is embedded within a Perl script to facilitate expeditious parameter and data parsing. All of our figures were generated ... Continue reading

Comments on this article Comments (0)

Version 2
VERSION 2 PUBLISHED 13 Jan 2014
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.