Challenges confronting precision medicine in the context of inherited retinal disorders

ABSTRACT
“Tonight, I’m launching a new Precision Medicine Initiative to bring us closer to curing diseases like cancer and diabetes — and to give all of us access to the personalized information we need to keep ourselves and our families healthier.” - President Barack Obama, State of the Union Address, January 20, 2015. This new precision medicine initiative has generated excitement and promises of improved health outcomes. The challenge is to use individual-specific information to provide optimal, cost-effective, customized care. This entails a comprehensive knowledge of individual behaviors, diet, and exposures as well as a complete past medical history of an individual and detailed (preferably quantitative) descriptions of clinical findings and measurements. The molecular genetics of an individual as well as that of diseased tissues such as cancers are an integral part of this program. Molecular diagnostics are especially important for individuals who are experiencing disorders that are clearly traceable to genetic variations. However, to realize the potential of combining genetics with non-genetic factors that contribute to disease, we will need to have a much better understanding of how genetic variants contribute to normal and pathologic physiology, and how to classify these genetic variants, which are often uncharacterized. In this paper we will consider the promise and challenges of molecular diagnostic testing for rare genetic disorders, with a focus on inherited retinal disorders (IRD), and leave the consideration of identifying, quantifying, and assessing non-genetic risk factors for disease to others.


The challenges
There are two basic situations in which we seek to make a molecular genetic diagnosis for an individual. The first is when a person presents with clinical symptoms and/or findings that suggest a genetic etiology. The subtler or milder the clinical findings, the more difficult it can be to narrow the genetic search. At the other extreme, with advanced disease, one may have loss of clinical features or such severe pathology that a selective genetic search is again challenging. In these cases, one is often confronted with testing a panel of genes or a full exome analysis and having to interpret the multitude of observed variants as to whether or not they account for the observed disease. As a corollary of this first situation, we have patients with malignancies that have arisen in association with one or multiple somatic genetic variations, and one is seeking a molecular characterization of the lesions for both diagnostic and therapeutic purposes.
In the second situation, genetic testing is being sought to identify an individual who is at risk of having a condition. This is commonly the motivation for prenatal testing but also can be relevant in children and adults who have affected family members or again very early symptoms. This type of testing is becoming increasingly important as we identify genetic variants that cause substantial risk for adult-onset disease and cancers. One again faces the challenge of interpretation of the multitude of genetic variants but with the added uncertainties of unknown penetrance, genetic and gene-environment interactions, and genetic variants of unknown significance (VUS).
The challenges of identifying somatic genetic variants that characterize a malignant tumor are very different from determining if one or more germline genetic variants will ultimately lead to a disease state. The recent successes in tumor-specific therapies have illustrated the potential of precision medicine but also its current limitations. [1] Both the genetics of tumors and Mendelian disorders start with characterizing tissue or individuals (and families) that reflect a diseased state. For tumors, we have to consider the genetic and functional heterogeneity within the lesion. Traditional Sanger sequencing has only limited sensitivity in detecting genetic variants that are present in a relatively small proportion of the tumor cells. However, next generation sequencing (NGS) methods, when done with sufficient coverage, can reliably detect variants that are present in 0.01% or more of a mixed cell population. [2] When fine needle biopsies are employed, one still has to worry that the sampling has not been representative of the genetic diversity within the entire tumor (as well as the genetic changes in those cells that may have already metastasized). For Mendelian disorders, we have to address the unknown penetrance and expression of genetic variants that have not been studied in unaffected individuals. If we detect a disease-causing variant in a gene responsible for autosomal dominant juvenile- or adult-onset disease in a fetus, can we be confident that the child will actually develop the disease? Can we predict the age of onset and severity of that condition based solely on the variants that are in the causative genes? Even with our improved knowledge of mutation load (genetic burden; see The Wild West of Variation Callings) as a potential disease mechanism and large population allele frequency data, it remains a challenge to confirm causality.
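The dependence of low-fraction variant detection on coverage can be illustrated with a simple binomial sketch. This is not the method of reference [2]; it assumes that each read independently carries the variant with probability equal to the variant fraction, ignores sequencing error entirely, and uses a hypothetical function name (`detection_probability`) and a hypothetical minimum number of supporting reads.

```python
from math import comb


def detection_probability(depth: int, variant_fraction: float,
                          min_alt_reads: int = 1) -> float:
    """Probability of observing at least `min_alt_reads` variant-carrying
    reads at a site sequenced to `depth`, under a binomial model.

    Assumptions (illustrative only): reads are independent, each carries
    the variant with probability `variant_fraction`, and there is no
    sequencing error.
    """
    # Probability of seeing fewer supporting reads than required.
    p_fewer = sum(
        comb(depth, i) * variant_fraction**i
        * (1.0 - variant_fraction) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # A variant fraction of 0.0001 corresponds to the 0.01% figure in the text.
    for depth in (1_000, 10_000, 100_000):
        p = detection_probability(depth, 0.0001, min_alt_reads=5)
        print(f"depth={depth:>7}: P(at least 5 variant reads) = {p:.4f}")
```

Under these assumptions, the chance of collecting enough supporting reads for a very rare variant rises steeply with depth, which is why "sufficient coverage" is the operative qualifier. A real pipeline would also have to separate true variant reads from sequencing errors, which this sketch does not attempt.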
Limited genetic testing can also be problematic and lead to mistaken attribution of cause to a putative variant in a candidate gene. When a dominant mutation is identified after sequencing a single candidate gene for a condition that is likely to be genetically heterogeneous, are we really confident that the case has been 'genetically resolved'? Frequently, additional potentially disease-causing variants can be identified if more genes are sequenced. The classic concepts of incomplete penetrance require an ongoing reevaluation in light of a growing number of genetic variants that are disease modifying and/or epistatic. The genetics of Mendelian and complex genetic disorders have to consider not only the DNA variants that are causal (direct and/or probabilistic) for the disease, but also those variants that can affect the host responses to those perturbations (disease modifiers) over a lifetime.

Inherited retinal dystrophies (IRD) as a model for precision medicine
The clinicians and scientists committed to the field of inherited retinal dystrophies (IRD) are in a unique position because these conditions are readily identified and described. These conditions have been the subject of numerous clinical and research descriptions since the invention and implementation of the ophthalmoscope after 1851. [3] Retinitis pigmentosa (RP) was first defined by Donders in 1857 and the first causative gene in humans was identified as rhodopsin in 1990. [4] Since that time, there have been thousands of articles describing both diverse phenotypes and numerous causative mutations (see RetNet [5]). IRD offer a valuable case study of the promises and limitations of applying molecular diagnostics for precision medicine to these patients. Recently, the molecular diagnosis of IRD has been reviewed extensively. [6][7][8] Inherited retinal disorders provide a microcosm of the challenges that confront precision medicine. While most IRD are Mendelian disorders and relatively rare, collectively they constitute a major healthcare burden because of the longevity of individuals who are either born blind or progressively lose their vision in their teenage and young adult years. As a group, IRD are among the most genetically heterogeneous sets of conditions, and molecular genetics research in both humans and animals has identified more than 250 genes that can be responsible for this group of disorders. [5] In addition, IRD conditions are at the forefront of genetics-based therapies, including small molecule, gene replacement, and stem cell approaches. Gene replacement therapy for RPE65-based Leber Congenital Amaurosis (LCA) has been successful for sustained improvement of visual function. [9] Clinical trials are beginning or underway for Choroideremia, Stargardt Disease, Achromatopsia, and LRAT-based LCA. Small molecule therapies are beginning for Stargardt Disease, and stem cell programs for RP and Stargardt Disease are also underway.
The promise of disease-specific therapies for these disorders has made the drive for better diagnostic tools even more pressing.
Molecular diagnostics, particularly with NGS methods, continues to identify new causative genes and mutations, and has brought a growing appreciation that the phenotypes of mutations for a given gene can often be far more complicated than we previously realized. In fact, there is really no consensus on how many genes are involved with IRD. There is a broad range of mechanisms by which mutations in these genes can give rise to IRD. Some genes causing IRD are specific to cell types in the retina and pigment epithelium, while others are constitutively expressed but play critical roles in the highly specialized cells and structures in the eye. There are examples of autosomal dominant and autosomal recessive RP caused by mutations in the same gene, such as RP1. [10] There are cases in which mutations in the same gene can give rise to nonsyndromic and syndromic RP, such as mutations in USH2A, BBS1, IFT140, and IFT172. [11][12][13][14] There are examples of individuals within the same family exhibiting a range of phenotypes, even with the same mutations in a gene (see, for example, PRPH2). [15][16][17] Finally, many genes with known mutations that give rise to syndromic forms of IRD, when included in IRD genetic testing panels, are now being reported to have mutations that can give rise to a broader range of phenotypes than were previously described. In some cases, clinicians have identified nonsyndromic disease caused by some of these genetic variants. Of course, with an expanding set of genes that are sequenced, one finds an increasing number of variations in these genes that may or may not have any clinical relevance. Since we have yet to fully understand the variability of phenotypes associated with many disease-causing genetic variants, we need to maintain sufficient flexibility to recognize new patterns.
For example, it has been recognized that mutations in RDH5, SAG, and PDE6B that were thought to cause forms of congenital stationary night blindness (a nonprogressive form of dysfunction of the rod photoreceptors) can lead to progressive vision loss in some individuals. [18][19][20] Though mutations in a wide variety of genes can give rise to overlapping and clinically indistinguishable forms of retinal degeneration, there are some forms of IRD that do have distinctive clinical findings that can help narrow the search for the causative gene; for example, preserved paraarteriolar retinopathy with CRB1 mutations and the crystalline deposits seen with Bietti crystalline retinopathy (CYP4V2); but these are relatively rare. From molecular genetic studies, we now appreciate that the phenotypes associated with variants in these particular genes are more diverse than their classical definitions. By testing a large number of genes known to cause retinal disorders, there have been numerous instances in which genetic mutations have been linked to novel disease manifestations. [21][22][23] However, all of this genetic testing for numerous genes comes at a price, as we acquire more information that is complicated to interpret and that challenges our ability to provide definitive answers to our patients.
While our understanding of the genetics of IRD is unparalleled compared to other groups of Mendelian disorders, our abilities to predict the phenotype, age of onset, severity, and rate of progression of all of these conditions are still very limited. Thus IRD as a group offers immediate opportunities for precision medicine, as well as a significant set of challenges for its expansion. The diagnoses of IRDs are complicated by the existence of acquired conditions, including toxic and drug-related retinopathies and autoimmune and paraneoplastic retinopathies, that can be virtually indistinguishable from some forms of IRD. For these conditions, a Mendelian genetic influence is unlikely, though many have speculated that the complex genetics of drug metabolism and toxicity as well as the genetics that underlie an autoimmune predisposition may play major roles. [24,25] However, for these acquired conditions, one has the obligation and opportunity to intervene in the disease progression by alternative approaches. The possibility of any individual losing sight from one of these acquired retinopathies places limits on the diagnostic capabilities of IRD genetic testing and complicates our interpretation of 'negative' molecular diagnostic results. Such issues remind us that precision medicine will entail more than just molecular diagnostic testing and that the patient history and exposures still play a critical role in diagnosis and management.
IRD molecular diagnostics has greatly benefited from the application of NGS in clinical diagnostics. Retinal gene panel and exome sequencing have revealed genes that can have mutations that give rise to either syndromic IRD or nonsyndromic IRD, and despite a growing number of identified genes and disease-causing variants, we are still finding a significant percentage of novel or extremely rare variants. No single gene dominates this group of disorders and only a small percentage of cases have clinical features that are distinctive for a particular causative gene, as noted above. IRD is widely distributed in all populations, though founder mutations have been well described. [26][27][28][29][30][31][32][33][34][35][36][37] One example of a founder mutation is the c.1148delC variant in the CNGB3 gene, [37] which is homozygous in most Caucasian patients with achromatopsia.
The most common form of IRD is RP, also known as nonsyndromic rod-cone dystrophy. Mutations in RHO are the most common cause of autosomal dominant RP (adRP), with mutations identified in 16-26% of patients. [38,39] However, only 15-25% of all RP patients are affected by adRP and 5-15% are affected by X-linked RP. [40] Other populations have reported a broad range of distribution of different RP inheritance patterns. [41][42][43] In the Casey Eye Institute molecular diagnostic laboratory, more than 50% of the samples are from countries with government funding for testing; therefore, the socioeconomic range is more diverse. RPGR and USH2A are probably among the most commonly mutated genes identified by our sequencing of ~250 genes, each with mutations identified in 5% of our patient cohort. Ascertainment bias may also be a factor affecting the distribution of causative mutations for domestic samples. Insurance companies consistently resist the use of molecular genetic testing, and it can be easier to secure approval for genetic testing with sporadic male RP cases, because of the possibility that the condition may be X-linked and affect family planning, than with sporadic female RP cases, who are more likely to have autosomal recessive RP. It is also possible that people with syndromic conditions are more willing to pursue genetic testing. However, if we divide patients based on age, CLN3 mutations are occasionally identified in the age group of ~8-10 years. Homozygosity of the 1 kb deletion is common in Caucasian patients with mutations in CLN3. Mutations in CLN3 can cause juvenile Batten disease. Interestingly, most cases found to carry pathogenic mutations carry mutations in more than one gene, and disease mechanisms can be confusing when different combinations of dominant, recessive, and X-linked genes are involved.
Even though autosomal recessive, dominant, and X-linked forms of inheritance can be established by family history information, a significant portion of individuals present as sporadic cases (often recessive but in some cases representing a lack of potential family history data and/or de novo mutations). For these sporadic RP cases, one is confronted by variants in genes that can give rise to either autosomal dominant or autosomal recessive disease, such as KCNJ13, GNAT1, RHO, PROM1, PRPH2, IMPG1, RP1L1, RP1, RDH12, ABCC6, GUCY2D, AIPL1, and CRX. [5] Digenic RP has also been previously described (ROM1/PRPH2 and CDH23/PCDH15). [44,45] It has become apparent that some of the previous conclusions as to causation were drawn from limited sequencing. In the new era of NGS, a different picture is gradually emerging. Clearly, in order to improve mutation detection rate, the entire set of potentially causative genes will need to be sequenced simultaneously.
At this time, locus heterogeneity and allele heterogeneity are no longer obstacles. However, the large amount of data generated has created new challenges of variation callings and data interpretations. The lack of standardizations of variation callings and data interpretations is becoming a major obstacle of precision medicine. Equally problematic, the simplistic views of one gene/one phenotype and phenotype/genotype correlations based on studies of single genes in the past are gradually being challenged. For example, mutations in ABCA4 or PRPH2 are known to give rise to a variety of phenotypes even among members of the same family. While environmental and exposure-related factors may play some role, the presence of additional genetic variants that can modify the clinical phenotype is the most likely explanation. To make matters even more complicated, different mutations in many genes can cause more than one condition. For example, mutations in CLN3 appear to cause juvenile Batten disease or nonsyndromic RP [46]; although linkage disequilibrium should still be carefully ruled out. It is likely that some variations defined as benign in the studies of syndromes may actually be pathogenic as hypomorphic alleles when present in patients with nonsyndromic forms of IRD. [47] The severity of clinical presentations and the age of onset in a specific condition may correlate with the degree that mutations disrupt protein synthesis or function in some cases, but not in others. A quantitative threshold may be at work for at least some genetic conditions (see the example of OCA1/AROA in The Wild West of Variation Callings). Unfortunately, there is still no sure way to predict pathogenicity of variations. Such uncertainty is already a considerable challenge for patients who exhibit clinical signs of IRD even when we can ask if the genetic variants are consistent with the clinical findings. 
However, the problem is more intractable when genetic testing of a fetus or child who is asymptomatic yields VUS (or even variants of known pathogenicity) and we are asked to predict if that child will develop signs of the condition when the penetrance and expressivity are unknown. Such issues will only be resolved if there is a collective analysis of population-based genetic evaluation with appropriate clinical follow-up.

Current status of molecular diagnosis of IRD
Before we can consider how to interpret the genetic variations that we observe in every human being and in tissues such as tumors, we need to consider the methodologies for detecting those variants. With the broad implementation of NGS in the clinical market, the 'gold standard' status of Sanger sequencing is gradually being displaced. The current debate is more about the clinical utility of disease-specific NGS panels vs the clinical utility of whole exome sequencing (WES) or even whole genome sequencing (WGS). There are both cost and ethical issues that affect the decision to do NGS for a limited set of genes or genetic regions as opposed to more comprehensive strategies. Sequencing depth (alluded to above for tumor genetics) is also a consideration in Mendelian genetics when one is concerned about heteroplasmy of mitochondrial genetic variants, the potential for somatic mutations, mosaicism, and uneven coverage. The choice of sequencing platforms is also becoming more diverse and can affect sequence assembly depending on the length of the sequence reads and regions of interest. Automation of sequencing and sequence analysis is clearly becoming more commonplace and helping to reduce costs, but one has to be mindful when read depth is being sacrificed for speed and cost-savings. For panel sequencing, target enrichment steps can be accomplished by hybridization/capture-based methods, multiplex polymerase chain reaction (PCR), or even simplex PCR. Detection of indel mutations can be accomplished by array comparative genomic hybridization analysis, although WES and especially WGS are likely to become the method of choice for the detection of both sequence and structural variations (although the limitations of detecting structural variations remain unclear at this time).
Coverage still remains an issue because NGS will continue to have regions of omission, particularly in regions that are resistant to amplification as well as highly repetitive DNA segments that cannot be reliably assembled from short DNA sequencing reads. [48,49] This seems to be a quantifiable and tractable issue that has been reduced by better enrichment methods and specialized sequencing efforts for some critical genes. For highly repetitive regions of the genome, longer fragment sequencing methods have been employed to detect rearrangements and indels. [50] In the NGS testing of patients with IRD, RPGR ORF15 has been one of the most difficult regions to sequence. Depending upon the enrichment methods, the ORF15 region cannot be sufficiently amplified by short PCR reactions, and sequence alignment and variation filtering (by eliminating false positive and nonspecific variations) require extra caution. In the CEI molecular diagnostics service, the combination of long range PCR and NGS has proven to be the most reliable method for sequencing RPGR ORF15 (unpublished data). Finally, filtering (by eliminating false positive, nonspecific, and benign variations) is arguably becoming the most critical step in variant calling even when the target regions can be fully covered by sequencing. If the filtering threshold is set too high, true mutations can be mistakenly eliminated and go undetected. However, if the threshold is set too low, false positive mutations require additional confirmation by a different method, which greatly increases costs. This is still an issue for WES and especially for WGS at this time. For this specific reason, deep sequencing of specific sets or panels of genes is still the recommended method when the clinical diagnosis is clear.
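The threshold trade-off described above can be sketched with a toy example. The variant names, quality scores, and truth labels below are entirely hypothetical (in practice the truth labels would come from orthogonal confirmation in a validation set), and the single quality cutoff stands in for whatever combination of filters a laboratory actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    name: str        # hypothetical variant label
    quality: float   # hypothetical caller quality score
    is_true: bool    # known only for a validation set


# Hypothetical validation calls, for illustration only.
CALLS = [
    Call("variant_A", 50.0, True),
    Call("variant_B", 25.0, True),
    Call("variant_C", 22.0, False),
    Call("variant_D", 8.0, False),
]


def summarize(calls, min_quality):
    """Count what a given quality cutoff keeps, loses, and lets through."""
    kept = [c for c in calls if c.quality >= min_quality]
    return {
        "kept": len(kept),
        # True variants removed by the filter (threshold too high).
        "missed_true": sum(1 for c in calls
                           if c.is_true and c.quality < min_quality),
        # False calls retained, each needing confirmation (threshold too low).
        "false_positives": sum(1 for c in kept if not c.is_true),
    }


if __name__ == "__main__":
    for cutoff in (10.0, 30.0):
        print(cutoff, summarize(CALLS, cutoff))
```

With these made-up numbers, the lenient cutoff retains every true variant at the cost of one call that would need confirmation by a second method, while the strict cutoff removes the false call but also discards a true variant.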
Because of our growing appreciation of phenotypic and genetic heterogeneity of IRD, single gene testing is becoming increasingly discouraged (unless the gene and mutation are already known within a family) because a precise molecular diagnosis cannot usually be reliably achieved by selecting a single gene for analysis. Unfortunately, this realization has not been fully embraced by many health insurance carriers, who repeatedly describe such gene panels as experimental and prefer to cover single gene testing (or none at all). See the BlueCross BlueShield Corporate medical policy (https://www.bcbsnc.com/assets/services/public/pdfs/medicalpolicy/general_approach_to_evaluating_the_utility_of-genetic_panels.pdf).
Even with the remaining uncertainties and variabilities at the sequencing level, it is becoming clear that the major problem facing us is no longer at the sequencing level. How to make sense of the large amount of sequencing data generated is the major challenge at this time. As we attempt to use NGS for Mendelian disorders and even for complex genetic disorders, we are finding that, instead of bringing in precision, NGS is exposing us to an unknown territory of complexity. We lack the tools and knowledge to provide meaningful interpretations of much of the genetic variation that we observe even within the context of a limited subset of human genes. We are really not ready to take advantage of the genetic information generated even as the sequencing cost falls toward $1,000. When sequencing a large number of genes in almost any individual, we almost always find variations of unknown significance in multiple genes, including those that are implicated in the condition that was the indication for the testing. This complexity no doubt has generated a huge problem for data interpretation and counseling. [51] But it also liberates us from the old school of thinking and puts us on the path to unravel the true nature of disease mechanisms. Finally, this topic has been extensively covered recently. [6,7]

The Wild West of variation callings
Today, variation callings are still mostly conducted by the individual testing laboratory. The common tools used by molecular diagnostic laboratories include databases of reported genetic variants (including common, rare, and disease-associated) and algorithms for predicting the likelihood that a variant is disruptive of transcription, splicing, and/or protein function. The Human Gene Mutation database (HGMD), ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/), Leiden Open Variant Database (LOVD; http://www.lovd.nl/3.0/home), and ClinGen (https://www.clinicalgenome.org/) are the most common databases used, but they are by no means complete or up to date. [52] An open source database such as ClinVar can sometimes create unintended problems once a variation is called. Since the entries are not curated or validated, incorrect assignments and biased entries can occur, especially for genes causing multiple conditions. The complexity of variation callings within the databases can be illustrated by the Oculocutaneous albinism type 1 gene (OCA1) p.R402Q allele (rs1126809; allele frequency 0.0813; mentioned as 'with pathogenic allele' by dbSNP). Oculocutaneous albinism type 1 (OCA1) was described as an autosomal recessive condition. [53] However, a milder condition, autosomal recessive ocular albinism (AROA, a condition that occurs mainly in Caucasians, with mostly ocular and fewer cutaneous findings), can also have mutations in OCA1. Interestingly, almost all of the AROA patients with mutations in OCA1 possess only one mutation in OCA1 in trans to the p.R402Q allele. [54] However, the combination is not sufficient to cause AROA, suggesting additional genetic factors are necessary. In ClinVar, there are seven calls for this variation (http://www.ncbi.nlm.nih.gov/clinvar/variation/3779/; pathogenic: 3; risk factor: 1; association: 2; benign: 1). Based on our review of cases from the CEI Molecular Diagnostics database, AROA is most likely caused by a quantitative threshold effect, with Caucasians having a lower threshold (lighter pigmentation profile); therefore, the presence of a mutation in trans to the hypomorphic allele can be disease-causing when additional genetic factors are present.
The concept of a quantitative threshold is likely not unique to AROA. In fact, when large numbers of genes are sequenced, many potential pathogenic variations can be identified. Some of these variations may interact with others and cause unique disease presentations. Some others may be present coincidentally. Dissecting the potential interactions and identifying the causative mutations is becoming a daunting task. Are we going to define positive interactions solely based on protein function, localization, and putative/reported interactions? In some cases, codon usage may have a measurable impact on protein levels and other variants may modify splicing patterns that can dramatically affect the biology of the cell. If so, we may miss unknown and unforeseen interactions, especially when only rare variations are considered. On the other hand, can we just assume that every identified mutation has the potential to contribute to disease threshold? Further, the current variation calling scheme mostly relies on the concept of purely recessive, purely X-linked, or purely dominant inheritance. Each variation is interpreted individually and not in the context of accumulated mutation load, even though every Mendelian disorder causes a cascade of changes in the gene expression and protein levels of affected cells due to direct and indirect interactions. This difficulty can be illustrated by the following two cases. In case 1, potential mutations in two unrelated genes were identified. In the study of a thirteen-year-old male with mild hearing loss and rod-cone dystrophy, p.R668C:c.2002C>T (predicted to be probably damaging by PolyPhen-2 with the highest score of 1.0) and c.5327-14T>G were identified in the MYO7A gene. Additionally, a hemizygous splice site mutation c.1309+1G>A was identified in the CACNA1F gene. Usher syndrome type 1B was excluded from clinical diagnosis because of the mild clinical presentation.
In this case, we can hypothesize that the MYO7A c.5327-14T>G variation is likely to be a hypomorphic allele and that the CACNA1F c.1309+1G>A mutation may also contribute to the mutation load. Further, MYO7A c.5327-14T>G may be a benign allele in some other genetic backgrounds, but it may not be benign in this unique case. In a second example, a reported dominant mutation was identified in PITPNM3, namely, p.Q626H:c.1878G>C. Additionally, two variations were identified in TRPM1, namely, p.Q1200H:c.3600G>C (predicted to be possibly damaging by PolyPhen-2 with a score of 0.79 out of 1.0) and p.E1320K:c.3958G>A (predicted to be possibly damaging by PolyPhen-2 with a score of 0.897 out of 1.0). The two TRPM1 variations are in trans. Mutations in PITPNM3 can cause dominant cone rod dystrophy and mutations in TRPM1 can cause CSNB. The original differential diagnosis was cone rod dystrophy. However, the allele frequency of the PITPNM3 mutation is 0.002495 (http://exac.broadinstitute.org/variant/17-6371557-C-G), suggesting that it may not be acting as a purely dominant mutation. The incomplete penetrance is most likely due to the presence or absence of additional mutations/modifiers. In this case, the TRPM1 variations could be additional mutations/modifiers even though these two genes have no established connection through shared pathways. In both of these examples, we are left with a set of questions and possibilities that cannot be addressed with isolated cases. The recently published concept of genetic burden in neuropathy supports the concept of mutation aggregation. On average, there are 2.3 additional damaging variations in patients with Charcot-Marie-Tooth (CMT) disease compared to 1.3 in controls. [55] How do we define these additional damaging variations, and how do we confirm whether these multiple variants are having a combined effect in a single affected individual?
This will no doubt become one of the most difficult challenges when trying to further extend the precise molecular diagnosis of diseases such as IRD.
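One way to make the notion of genetic burden concrete is a per-individual tally of variants that an in silico predictor scores as damaging. The sketch below is not the method used in reference [55]; the `Variant` type, the `burden` function, and the score cutoff of 0.5 are hypothetical choices for illustration. The variants and PolyPhen-2 scores are those reported for the second case above; the PITPNM3 variant has no score quoted in the text, so it is recorded as unscored rather than assigned one.

```python
from typing import NamedTuple, Optional, Sequence


class Variant(NamedTuple):
    gene: str
    change: str
    score: Optional[float]  # in silico damage score in [0, 1]; None if not given


def burden(variants: Sequence[Variant], min_score: float = 0.5) -> int:
    """Count variants whose score meets a (hypothetical) damage cutoff.

    Unscored variants are not counted, which is a simplification: a
    reported pathogenic variant without a quoted score is ignored here.
    """
    return sum(1 for v in variants
               if v.score is not None and v.score >= min_score)


# Variants described for the second case in the text.
CASE_2 = [
    Variant("PITPNM3", "p.Q626H", None),   # score not given in the text
    Variant("TRPM1", "p.Q1200H", 0.79),
    Variant("TRPM1", "p.E1320K", 0.897),
]

if __name__ == "__main__":
    for cutoff in (0.5, 0.85):
        print(f"cutoff={cutoff}: burden={burden(CASE_2, cutoff)}")
```

The tally changes with the cutoff, which illustrates the definitional problem raised above: a burden count is only as meaningful as the rule used to decide which variants are "damaging", and a simple count says nothing about whether the counted variants actually interact.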
As noted above, the current genetic variant databases are vulnerable to prior misinterpretation of variants by other investigators. Programs that rely on population-based allele frequencies, evolutionary conservation of alleles, and predicted disruption of protein structure can provide some useful insights but clearly have shortcomings when the patient's findings are compared with the wrong population or when the allele is sufficiently rare. A variation may be predicted to be pathogenic in one population but of questionable pathogenicity in a different population depending upon allele frequencies. For example, the allele frequency for the ABCA4 p.G1961E mutation is 0.004723 in Europeans but 0.01498 in South Asians (http://exac.broadinstitute.org/variant/1-94473807-C-T). If that variant is in strong linkage disequilibrium with a modifier variant or haplotype in different populations based on founder effects, one can observe legitimate differences in pathogenicity among different populations. Ethnicity is becoming an important factor in variation calling; however, it can also introduce confusion. Can we really define pathogenicity by relying on allele frequency data from only one population? This is especially worrisome for understudied minority groups. Data filtering and variation calling in these groups rely on prior experience with other populations who may have different haplotypes for those genes. With a few exceptions, current software is remarkably poor at determining how a change in the amino acid sequence of a protein will affect its biological activity, interaction with molecular partners, localization, or secretion from the cell and its resistance to degradation.
In addition, current programs fall short of identifying variants that do not alter a protein-coding sequence but which could have clinical implications, either through transfer RNA availability within the cell and its impact on synthesis rates or by affecting embedded splice enhancer or regulatory sequences that may lie within a coding region.
While some progress has been made, [56] these alternative methods of defining the biological implications of variants have yet to achieve common use and clinical validation. As if the situation were not already complicated enough, we have multiple examples of deep intronic and promoter region variants that have been shown to be causative of gene dysfunction (and disease) based on in vivo and in vitro studies, but which have yet to be generalized into a bioinformatics algorithm that can be suitably predictive. The recent identification of deep intronic mutations in ABCA4, [57,58] the identification of a deep intronic mutation in OFD1 for X-linked RP [59] and the final identification of the genetic cause of North Carolina macular dystrophy as point mutations in the DNase hypersensitivity site upstream of PRDM13 and a 123-kb tandem duplication containing the PRDM13 gene [60] illustrate the limitations of mutation prediction.
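The dependence of a frequency-based call on the reference population, noted above for ABCA4 p.G1961E, can be illustrated with a minimal sketch. The two allele frequencies are those quoted in the text; the 0.005 value is the recessive cutoff mentioned elsewhere in this paper and is used here only as an illustrative filter. The function name `passes_frequency_filter` is hypothetical and does not correspond to any particular variant-calling tool.

```python
# Allele frequencies for ABCA4 p.G1961E as quoted in the text (ExAC).
ABCA4_G1961E_AF = {
    "European": 0.004723,
    "South Asian": 0.01498,
}

# Illustrative recessive allele-frequency cutoff (assumption for this sketch).
RECESSIVE_AF_CUTOFF = 0.005


def passes_frequency_filter(allele_frequency: float,
                            cutoff: float = RECESSIVE_AF_CUTOFF) -> bool:
    """Return True if the allele is rare enough to be retained as a candidate."""
    return allele_frequency < cutoff


# The same variant is retained or discarded depending on the reference population.
calls = {
    population: passes_frequency_filter(af)
    for population, af in ABCA4_G1961E_AF.items()
}

for population, retained in calls.items():
    status = "retained as candidate" if retained else "filtered out as too common"
    print(f"{population}: {status}")
```

With a single fixed cutoff, the variant is retained against the European frequency and filtered out against the South Asian frequency, which is the inconsistency the text describes.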

The practical issues encountered in molecular diagnostics and expanding our knowledge base
The concept of genetic burden may have a practical use in the evaluation of molecular diagnostic data from patients with the clinical diagnosis of RD. As mentioned previously, the clinical presentation of IRD may overlap with presentations caused by nongenetic conditions. Since a negative result can be caused by different factors including mutations outside the sequenced regions or in unknown genes, one can never be absolutely certain that patients with negative testing results truly have genetic conditions. However, after comprehensive and deep sequencing of all the known IRD genes, the total identified pathogenic variations (genetic burden) as well as the 'as yet unknown' variants for the complex traits that may control vulnerabilities for toxic and/or autoimmune responses may be helpful in deciding disease mechanisms for the 'negative cases'. Additional study is required to investigate the practical value of this application.
With the ongoing discovery of rare IRD cases involving both novel mutations and previously uncategorized genes, is it reasonable for journals and their editorial staff to require that every novel mutation or newly recognized gene for IRD be proven in multiple individuals/families and/or with extensive in vivo or in vitro experimentation? Do we miss opportunities for discovery because of the limitations of our curation systems and approaches? The boundaries between research testing and clinical testing as a means of gene discovery have begun to blur as clinical panel, exome, and WGS are being employed in clinical laboratories. Some large clinical sequencing laboratories may even have sequenced more patients than research laboratories for certain specific diseases. However, these laboratories are neither equipped nor funded to undertake the types of in vitro and cell culture studies that might be essential to establish causality of specific genetic variants. Whether to publish or withhold unproven genetic associations with clinical cases is a real dilemma. Should we just play it safe, ignore the complexities of genetic mechanisms, and present the genetic testing results to the clinicians and their patients as 'inconclusive' or 'negative'? Or could we document possible findings in a way that allows others to understand the level of uncertainty in the implications of an observed association and then amend that level of uncertainty as new information becomes available from other sources?
With the arrival of several clinical trials including gene therapies, stem cells, and small molecules, the new concept of genetic burden (mutation load) will need to be incorporated into the pretreatment evaluation. Gene therapies that are directed toward correcting a single defective gene may fail because individuals may have several genetic variants that play causal and/or contributory roles in disease and limit the effectiveness of a single gene therapy. Simply restoring a missing functional gene and its enzyme in one or more cell types may not eliminate the deleterious effects of the remaining abnormal gene product in those cells, such as activation of the unfolded protein response or protein mislocalization. Gene therapies based on introducing gene products that foster cell survival will rely heavily on the effects of genetic background variants to determine their effectiveness. Just as one would like to avoid complications of drug therapies by employing pharmacogenomics (a goal that has been achieved with only limited success), one would ideally want to be able to assess the potential safety and efficacy of a genetics-based intervention through an understanding of the genetic variants that control the cellular pathways that are critical for cell function and survival.
Finally, the lack of 'authoritative variation calling' invites different approaches and opportunities. Would an 'elitist approach' of limiting predictions of variant pathogenicity to a limited panel of experts provide more accurate and reliable information? Would an open system similar to the Wikipedia model provide a better means of acquiring and sharing new knowledge that would tend to be self-correcting? So far the open system ClinVar model has not worked particularly well with respect to both data entry and ongoing curation and revision. Commercial interests probably have complicated these efforts. The initial monopoly of BRCA testing by Myriad Genetics created probably the most extensive proprietary genetic data for the BRCA genes. Even now with no patent protection, the proprietary genetic data continue to differentiate Myriad Genetics from other genetic testing services for these genes. The Genomics England 100,000 Genomes Project will probably contribute significantly to the mutation database, although the exact number of patients for each specific disease will be much smaller. Funding agencies such as the National Institutes of Health (NIH) can take the lead to standardize genetic testing for funded projects and collect genetic data from the funded projects. Unfortunately, with the democratization of genetic testing, the funding of sequencing patients becomes less favorable unless such efforts are integrated into their care process and covered by third-party payers.

The long march toward bringing precision into molecular diagnosis
Our appreciation for the diversity of phenotypes that can be caused by mutations in genes associated with diseases has been greatly expanded since the adoption of NGS testing. Unfortunately, we currently lack sufficient standardization and completeness across the various NGS tests to ensure that we can identify and classify the genetic variants in all of the relevant genes that may potentially define the phenotype of the condition. Further limitations in our ability to recognize modifier genes and their variants are the limited number of cases that are having molecular diagnostic testing performed and the lack of standardization of phenotypic features of retinal disorders that would enable association testing. A further limitation is that modifier genes may be neither retina- nor eye-specific and they may not necessarily have variants that are causative of disease. Thus the current disease-specific panels will overlook this subset of genes. However, as the costs continue to decline, WGS will ultimately provide an alternative platform for coverage of both causative and modifier genes. The massive expansion of data from a whole genome approach will further challenge us to identify meaningful associations, even with larger cohorts. Transcriptome analyses (such as with differentiated iPS cells) and in vitro models may help us narrow those searches and provide additional evidence for functionally based associations. However, even with murine models of IRD, it has long been recognized that genetic backgrounds can affect phenotypes. [61] Therefore, our definitions of disease-causing variants are going to continue to change, including the recognition that some variations may only become pathogenic when present in unique genetic backgrounds or in combination with multiple variants (mutation load). Mutation, modifier, and genetic background cannot each be clearly defined, and the definitions can be subjective.
Clearly establishing the criteria by which we can conclude that one or more genetic variants are responsible for an individual's clinical IRD is a key challenge. There are published examples for which attribution errors have been made and then propagated in the clinical literature. [62] With much improved allele frequency data generated by large-scale sequencing projects (such as the ExAC Browser: http://exac.broadinstitute.org/ and the NHLBI Grand Opportunity Exome Sequencing Project (ESP): https://esp.gs.washington.edu/drupal/), evaluation of once rare variations is becoming much simpler. Some of the previously published dominant mutations should be reevaluated. [63] Theoretically, the allele frequency of a purely dominant mutation should be less than the disease incidence. However, the issue of penetrance is most likely related to the presence or absence of additional mutations/modifiers. If true, then what should be the acceptable cutoff for a dominant mutation? A less penetrant dominant mutation may actually have a higher allele frequency because of the requirement for additional modifiers. However, differentiating a digenic mutation mechanism from a modifier effect can be difficult. Theoretically, a digenic mutation is defined as the coexistence of two variations in two non-allelic genes that is necessary to develop a disease phenotype, whereas a single variant does not lead to a phenotype by itself. However, there are only a few published examples, such as the ROM1 and RDS (PRPH2) digenic mutation mechanism. In practice, it can be very difficult to define new interactions. In one of our own cases of presumed digenic RP due to ROM1 and PRPH2 mutations, there are other family members who have been diagnosed with macular dystrophies that are likely to be attributable to just the PRPH2 mutations. However, to establish this, we need not only the clinical data and genetics of the proband but also the corresponding information from other family members.
Especially in the era of NGS, are we prepared to uncover multi-allelic interactions? This is especially relevant when dissecting later-onset conditions. The presence of multiple pathogenic variations in different genes may lower the disease threshold, and the accumulated effects of aging may finally push an individual past that threshold. Further, can a mutation be defined as a digenic mutation in one case but work as a modifier in a different case? Finally, do digenic mutations only occur with specific combinations of mutations? One can envision that various mutations, including dominant mutations, work together and that various combinations of these mutations create the hallmarks of dominant inheritance: variable expression, penetrance, and age of onset. For recessive mutations, the 0.005 cutoff may actually be too high for most mutations. Founder mutations in smaller populations may distort allele frequency and thus it is relevant to know the ethnic origins of patients undergoing molecular diagnostic testing. Finally, we do not know the degree of incomplete penetrance for many Mendelian conditions or whether there are variant-specific cases of incomplete penetrance, and this complicates our ability to predict the future onset of disease in a person who is found to carry disease-causing or predicted disease-causing variants.
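The relationship between allele frequency, penetrance, and disease frequency for a dominant mutation can be sketched with simple arithmetic. Under Hardy-Weinberg proportions and a rare allele of frequency q, about 2q of individuals are heterozygous carriers; if a fraction p of carriers are affected (penetrance), this one allele alone accounts for about 2qp of the population, which cannot exceed the overall disease prevalence K. Hence q ≤ K / (2p). The sketch below encodes this bound. The function names are hypothetical, and the prevalence of 1 in 3,000 is a placeholder chosen for illustration, not a figure taken from this paper; the allele frequency of 0.002495 is the PITPNM3 value quoted earlier.

```python
def max_dominant_allele_frequency(prevalence: float, penetrance: float = 1.0) -> float:
    """Upper bound on the frequency of a single dominant allele: q <= K / (2p)."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if not 0.0 < penetrance <= 1.0:
        raise ValueError("penetrance must be in (0, 1]")
    return prevalence / (2.0 * penetrance)


def exceeds_dominant_ceiling(allele_frequency: float, prevalence: float,
                             penetrance: float = 1.0) -> bool:
    """True if the observed frequency is too high for the assumed penetrance."""
    return allele_frequency > max_dominant_allele_frequency(prevalence, penetrance)


def implied_max_penetrance(prevalence: float, allele_frequency: float) -> float:
    """Largest penetrance compatible with the observed frequency: p <= K / (2q)."""
    if allele_frequency <= 0.0:
        raise ValueError("allele_frequency must be positive")
    return min(1.0, prevalence / (2.0 * allele_frequency))


# Placeholder prevalence for illustration only (assumption, not from the paper).
HYPOTHETICAL_PREVALENCE = 1.0 / 3000.0
PITPNM3_AF = 0.002495  # allele frequency quoted in the text

too_common = exceeds_dominant_ceiling(PITPNM3_AF, HYPOTHETICAL_PREVALENCE)
penetrance_ceiling = implied_max_penetrance(HYPOTHETICAL_PREVALENCE, PITPNM3_AF)

print(f"Too common for a fully penetrant dominant allele: {too_common}")
print(f"Maximum compatible penetrance: {penetrance_ceiling:.3f}")
```

Read in this direction, an allele frequency that is too high for a fully penetrant dominant allele does not by itself exclude pathogenicity; it places an upper limit on penetrance, which is consistent with the suggestion that additional mutations or modifiers are required.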
A number of investigators have cited examples of potential variants in 'normal' genes that may affect expression and hence the severity of disease when there is a mutation in the genes that can give rise to autosomal dominant disease. [64,65] Variants in cis with a disease-causing mutation may mitigate or worsen its effects. Similarly, rare and/or some common variants in other genes that are expressed in the target tissues may contribute to the severity and distribution of pathologies. Even when two variations of unknown significance are in trans, we still lack the tools and knowledge for identifying the relevance of these variants for human disease. It will take detailed and systematic phenotyping over time (to measure disease features and progression) as well as massive datasets of genetic information of many individuals to elucidate these complex relationships.
Finally, the upcoming regulations from the US FDA may have an unintended effect by forcing consolidation and pushing for economy of scale. Excessive regulation is costly and can stifle innovation. However, the current Wild West scenario in molecular diagnosis is not sustainable and will not reach its full potential of delivering precision into medicine. Will market forces ultimately correct the path by allowing innovation and self-correction? Alternatively, will the pharma-like model (a few big testing laboratories control the market) or the cell phone carrier model (free genetic testing in order to sell more drugs) prevail? We have witnessed a true revolution in molecular diagnosis in the last five years. The next five years will no doubt be more exciting, breathtaking, and groundbreaking. This is a once-in-a-lifetime opportunity and we are writing a chapter of medical history.

Expert commentary
Hopefully we have convinced the readers that the conceptual advancement of precision medicine (individualized medicine) is no longer primarily limited by sequencing power. To a large extent, the molecular diagnostics of IRD has already been a major advance toward precision medicine with respect to identifying the causative genes and mutations for a large percentage of these conditions in affected individuals. To close the gap in the molecular diagnosis of these individuals, we are being challenged to abandon our prior simplistic and limited concepts of mutation mechanism and the classic Mendelian concepts of disease causality. With this understanding, the variation grouping of pathogenic, likely pathogenic, unknown significance, likely benign, and benign cannot really reflect the complexity of disease presentations. A classification approach that is probabilistic and biologically based and that incorporates both independent and interactive genetic components would truly allow for individual-based precision medicine. However, we are a long way off from having the data and the analytical tools to build and test such models. The work of the Exome Aggregation Consortium has started the journey. The Genomics England 100,000 Genomes Project will likely become a game changer with its focus on patient genomes. Gradually, mutation analysis is being put into the context of race and gender. Ethnicity, whether self-defined by the individual or more accurately estimated from an individual's genome-wide genetic variants in comparison with other populations, will be a factor in deciding if some variants are likely to be disease-related or should influence our disease models. In a similar manner, the role of genetic burden (mutation load) may also be a factor for some conditions. With our current state of knowledge, IRD diagnostics does offer limited but real clinical benefits. What do we need to do to take it to the next level of being predictive rather than confirmatory?
That is where the real 'jump' will have to take place. To truly realize the potential promise of precision medicine, we will need far more knowledge of the complex interactions of genes and their variants to adequately predict the onset, severity, rate of progression, and phenotype of genetic conditions before they are clinically evident and have already jeopardized human health. If we are truly committed to this level of genetics for precision medicine, we will need a concerted effort to gather much more extensive and complete genetic data from large numbers of patients, rather than focusing on a more selective approach of testing potential causative genes. Luckily, IRD is best suited to take the lead with its large patient cohort, more precise clinical descriptions and quantitative measures of severity and progression and multiple exciting avenues for clinical treatment. The joint force of clinical diagnosis, molecular diagnosis, and pharmacological genomics will offer an unprecedented tool to make medicine and clinical practice more precise.

Five-year view
The next five years will likely be a critical developmental stage for molecular diagnosis. There are major aspects of healthcare that will determine both the speed and the direction in which molecular diagnostic testing will develop. In the United States, there has been significant resistance to genetic testing by private insurers as well as the government Medicare and Medicaid programs. While some countries with national health systems have come to the conclusion that there is a collective benefit in genetic testing of individuals with genetic disorders and have created national registries, concerns over privacy and fears of exploding healthcare costs have limited the adoption of genetic testing as well as the appropriate levels of genetic counseling that are needed to use these tests. Medicare has consistently refused to establish reimbursement levels for both genetic testing and counseling, thus making it highly stressful for patients and their physicians to do testing. Third-party insurers routinely deny requests for these services, usually on the grounds that a molecular diagnosis for most genetic diseases will not alter patient therapy or clinical outcomes and that the gene testing panels are not FDA approved. It is highly likely that molecular diagnostic testing will continue to be underutilized and that the acquisition of the population-based data that is really needed will be dependent on foreign countries until there are sufficient advances in gene-based therapies to force their adoption in the United States.
Even with new therapies beginning to emerge, there is a need to convince insurers that genetic testing of panels of genes or whole exome or genome testing is more clinically relevant and cost-effective than forcing physicians to guess which gene or genes should be tested for each clinical case. With clinical exome or genomic testing, we will have to grapple with the implications of discovering genetic variants that predispose to diseases that are not yet clinically evident. Groups have already begun to consider policies and guidelines for the counseling and use of the unanticipated results of widespread genetic testing. [66] Only with large-scale data acquisition and sharing will we be able to provide proper interpretations of disease risk. Only with a health system that is designed for preventive care will such information become clinically useful and cost-effective. At the level of data analysis, better data interpretation based on ethnicity, gender, and clinical presentations will be gradually realized. The mutation detection rate for IRD will not only approach 100%, but testing may also become a means to rule out nongenetic RD conditions. If the entire genome has been sequenced and still no mutations are identified, IRD can most likely be ruled out. The diagnosis of nongenetic RD conditions may also be improved by including the testing of microbiome, mRNA, and epigenetics.
A considerable effort will be required to train healthcare professionals to properly use and interpret these genetic tests. It is impossible to expect all clinicians to have the appropriate training to oversee and manage genetic testing and there are insufficient numbers of geneticists and genetic counselors to provide these services. We will need to establish qualifications for a larger group of clinicians to be able to use molecular genetic testing as part of their subspecialty care.
One ongoing barrier to insurance company adoption of gene panel-based genetic testing has been the lack of FDA oversight. When and how will the FDA finally regulate laboratory-developed tests (LDTs)? For rare genetic conditions, especially with the disease-specific panel approach, the rarity of the tested conditions has always been a strong counterargument against FDA regulation. However, sequencing technology is gradually becoming a commodity, and standardization and validation can finally be done at the WGS level as a condition-nonspecific platform. The FDA-cleared MiSeqDx instrument has become a market-differentiating business strategy. FDA approval will in fact become an incentive rather than a burden for large equipment companies. By contrast, start-ups, even those with groundbreaking technologies, may encounter higher barriers to entering the clinical testing market. At the laboratory level, economy of scale will likely prevail but the cost of testing may not come down significantly because oligopoly tends to discourage price competition. In the United States, insurance companies will likely embrace molecular diagnosis at least for some conditions such as IRD. Finding the main genetic causes of IRD can avoid unnecessary diagnostic work-ups and actually save money. Also, owing to the introduction of various therapeutic trials, the demand for molecular diagnosis of IRD will no doubt become stronger.
Financial & competing interests disclosure J Chiang acknowledges the support from clients and support from Casey Eye Institute by grant P30 EY010572 from the National Institutes of Health (Bethesda, MD), and by unrestricted departmental funding from Research to Prevent Blindness (New York, NY). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.