Introduction

One of the biggest challenges in current medicine is the difficulty of preventing and treating a large number of chronic and complex diseases, such as diabetes, cardiovascular and cerebrovascular diseases, cancer, and Alzheimer's disease1. It has been widely demonstrated that complex diseases result from a combination of multiple environmental and genetic factors2; thus, intervening on a single target within the complicated disease network may yield only limited clinical outcomes. The need for multi-target combination therapeutic strategies has been increasingly recognized3. On the other hand, with the advance of genomics technology, precision medicine in recent years has aimed to achieve personalized treatment based on each individual's genetic disposition, which underlies the differing response to treatment4. Interestingly, traditional Chinese medicine (TCM) systematically examines the symptoms of the disease throughout the whole body, identifies the patterns or models of disease syndromes (Zheng), and prescribes a corresponding Fangji consisting of multiple herbs to treat each patient individually. Therefore, Fangji are not only multi-targeted but are also used for personalized medicine, which gives them unique research and clinical value. The advances in high-throughput genomic and other omics technology provide powerful tools for investigating the systems biology of the multi-targeted and personalized nature of Fangji5.

Complex diseases, as complex or quantitative traits, are determined by many genes, most of which have small effects, and by the interactions of these genes with one another and with the environment6. The contributing genes are called quantitative trait genes. The tremendously high diversity of genetic variation and its interaction with the environment contribute to differences in the pathogenesis and development of diseases7 and lay the foundation for the variety of patterns of TCM syndromes (or Zheng) of the disease, as well as the corresponding personalized treatment. Fangji typically consist of a number of herbs, combined according to the principles of Monarch, Minister, Assistant and Guide (Jun, Chen, Zuo and Shi) and other characteristics of compatibility. Each constituent herb usually contains hundreds or thousands of different compounds8. The overall efficacy is only infrequently accounted for by one or a small number of compounds. Rather, it arises from the synergy of various low-concentration, small-effect compounds, together with a few enriched and active compounds. A Fangji, or its many small-effect active compounds, acts on the signaling systems controlled by many small-effect genes, modifying the molecular architecture susceptible to diseases, which may underlie the systems biology of personalized treatment with Fangji. Therefore, we can infer the particular molecular architecture that mediates the personalized medicine of Fangji by dissecting the genetic variants influencing the likelihood of response to Fangji, which can be implemented with genome-wide association studies (GWAS) and/or integrated with other omic approaches.

GWAS and full genetic model

The success and problems of GWAS

Based on the completion of human genome sequencing and the development of high-throughput genotyping techniques, GWAS analysis of complex diseases at the genome level comprehensively reveals the genetic factors and the network for pathogenesis and development of disease and drug response9. In a typical GWAS, each individual's whole genome is analyzed by using millions of single nucleotide polymorphisms (SNPs) as molecular genetic markers, and the genetic variants are associated with variations in the complex traits of interest in the population10. Human complex traits include physiological traits such as height and body weight, as well as disease traits such as hypertension and glaucoma, which are determined by a variety of genes, their environment, and their interactions11,12,13. Complex diseases, TCM Zheng, and response to drugs are all complex traits and can be studied using the GWAS approach. In 2005, the first important GWAS reported an association with age-related macular degeneration14. Since then, findings from GWAS research have rapidly increased for different diseases, including cancer, diabetes, autism, systemic lupus erythematosus and psoriasis, and for response to drugs15,16. As of June 2017, nearly 3000 studies had reported approximately 37 000 associations with 31 500 SNPs (http://www.ebi.ac.uk/gwas/home). As with other phenotypes, GWAS can directly associate different TCM phenotypes, such as Zheng, constituent, or response to Fangji, with genome-wide genetic variability and are powerful for resolving the molecular basis of TCM-related phenotypes. This analysis represents an unbiased investigation of the molecular substrates associated with the phenotype and thus does not require any assumptions about the mechanism, anatomy and physiology of the traits studied, which is particularly important and useful for TCM-related phenotypic studies because the mechanisms of many of these phenotypes remain unclear.
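The single-marker scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular GWAS package: genotypes are assumed to be coded as allele counts (0, 1, 2), the trait is assumed to be quantitative, and the function names and toy data are hypothetical. A real analysis would also compute a test statistic per SNP and correct for the very large number of markers tested.

```python
from statistics import mean


def additive_slope(genotypes, phenotypes):
    """Least-squares slope of the phenotype on the allele count (0, 1, 2)."""
    g_bar = mean(genotypes)
    y_bar = mean(phenotypes)
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(genotypes, phenotypes))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    if sxx == 0:
        return 0.0  # monomorphic marker: no variation to associate
    return sxy / sxx


def scan(genotype_table, phenotypes):
    """Apply the same additive test to every SNP, one marker at a time."""
    return {snp: additive_slope(g, phenotypes) for snp, g in genotype_table.items()}


# Toy data (hypothetical): six individuals, two SNPs, one quantitative trait.
toy_genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 2],
    "snp_b": [1, 1, 0, 2, 0, 1],
}
toy_trait = [1.0, 2.0, 3.0, 2.0, 1.0, 3.0]
slopes = scan(toy_genotypes, toy_trait)
```

In this toy example the trait tracks the allele count of snp_a and is unrelated to snp_b, so only snp_a receives a non-zero slope.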

Over the past decade or so, GWAS analysis has revealed several important characteristics. The vast majority of associations were not located in the coding regions of functional proteins, implying that most of the associated variants were in regulatory loci17,18; additionally, each associated variant makes only a very small contribution to the disease phenotype19,20. These findings led to the conclusion that it is not the variants of the functional or core genes but the large number of low-contribution regulatory or peripheral genes that constitute the genetic network for complex disease21. This conclusion may need to be taken into account by the simplistic precision medicine model, in which only one or a small number of signaling pathways are considered. Although GWAS has achieved great success, current GWAS still have significant limitations. One of the major limitations is so-called missing heritability22,23,24. GWAS were expected to detect a large number of associations, as high-resolution genetic markers are used. However, for most disease phenotypes, the total heritability explained by the associations discovered by GWAS is very low, and thus the majority of associations remain undetected25,26. For example, based on pedigree studies, human height is 80% genetically determined. Up to now, approximately 50 SNPs associated with human height have been detected, together accounting for only 5% of height variation. This result can be partly ascribed to disease heterogeneity, rare variation, epigenetics, etc27. An important reason, however, is that the current model for association analysis is insufficient.
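The size of the gap in the height example can be made explicit with a small calculation. The sketch below uses only the two figures quoted above (80% from pedigree studies, 5% from detected SNPs); the function name is hypothetical, and treating the two percentages as directly comparable proportions of phenotypic variance is a simplifying assumption.

```python
def missing_heritability(h2_pedigree, h2_gwas):
    """Return (share of pedigree heritability explained by GWAS, unexplained remainder)."""
    if h2_pedigree <= 0:
        raise ValueError("pedigree heritability must be positive")
    explained_share = h2_gwas / h2_pedigree
    missing = h2_pedigree - h2_gwas
    return explained_share, missing


# Height example from the text: 80% heritable, 5% explained by detected SNPs.
explained_share, missing = missing_heritability(0.80, 0.05)
```

Under these assumptions, the detected SNPs account for roughly one sixteenth of the heritability estimated from pedigrees, leaving about 75 percentage points of trait variation unexplained.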

The full genetic model of GWAS

According to the principles of genetics, the gene effect consists of an additive effect, a dominance effect and an epistatic effect. For the additive effect, each gene acts independently. For the dominance and epistatic effects, there is genetic interaction between alleles and between non-alleles, respectively. However, the existing GWAS analysis essentially analyzes only additive effects, failing to discover gene-gene and gene-environment interactions28,29. As these interactions play important roles in determining complex traits, inadequate consideration of these interactions by the current GWAS analysis significantly underestimates associations30,31,32. To address this, a novel mixed-linear GWAS analysis model considering various interactions has been developed and implemented33. This model has been applied for re-analysis of some GWAS data for complex traits, including human alcohol dependence, cholesterol level, body mass index, coronary heart disease, mouse anxiety-like behavior, crops, etc33,34,35,36. The results have demonstrated that, in addition to replicating associations reported previously, an appreciable number of new associations have also been detected, particularly in dominance or epistatic mode. Subsequently, the molecular architecture of the associated genes detected has been constructed. For example, alcoholism, or alcohol dependence, is a complex disease that is approximately 50% genetically determined37,38. However, even though dozens of GWAS on alcoholism have been performed, the conventional GWAS approach has so far accounted for only a low heritability, less than 1%39. Chen et al29 reanalyzed the dataset from 3838 subjects for the quantitative trait of alcohol dependence symptom count (ADSC), considering additive, dominance, and epistatic effects and their interactions with the environment. This reanalysis detected 20 quantitative trait SNPs associated with ADSC. Five of these associations have been previously reported from different studies.
Additionally, the analysis revealed that the replicated association with the gene ADH1C was highly significant in a dominance inheritance mode and was predicted to increase the risk of ADSC at a considerably high level. Interestingly, an environmental factor, co-morbidity of substance dependence, also influenced the impact of the ADH1C variant on ADSC, dependent on the type of the substance dependence: only co-morbid opiate or marijuana dependence, but not nicotine or cocaine dependence, showed the effect, indicating the complexity of the gene-environment interaction. Fifteen new associations have also been identified in variants, including four novel genes, two non-coding RNAs and two epistasis loci. Two SNPs interacted in an additive × additive or additive × dominance manner, with one within a gene, PTPRG, encoding a protein tyrosine phosphatase (PTP), the other one near a gene, ANGPT1, encoding a PTP receptor, supporting the validity of the model for estimation of the epistatic effect. Both ADH1C and ANGPT1 have been found to be significantly or nominally associated with alcohol dependence40,41,42, and other family members of PTP are associated with the disorder43. From the results of the reanalysis, over 20 percent of total heritability was detected, much closer to the results from family and twin studies; dominance and epistatic effects accounted for over 50 percent of the total estimated heritability. In contrast, in the first paper on GWAS using conventional analysis of the same dataset, no association reached genome-wide significance44. To make the comparison more straightforward, GWAS of human total cholesterol level was carried out using a conventional single-locus additive approach, including PLINK analysis and GCTA analysis. These detected only two and one associations, respectively, at the genome-wide significance level.
Use of the full model, however, discovered 15 significant associations, and the dominance and epistatic effects accounted for approximately 60% of total heritability. Finally, simulation analysis further supported the validity of the full model35. In sum, the full genetic model has been demonstrated to improve the unbiased detection of a significant number of GWAS associations and has substantially alleviated the problem of missing heritability.
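To make the distinction between additive, dominance and epistatic terms concrete, the following sketch builds the predictor terms for a pair of SNPs and one environmental factor. It is an illustration only: the coding used here (additive as -1, 0, 1; dominance as an indicator of the heterozygote; epistasis and gene-environment terms as products) is one common textbook convention and is an assumption, not necessarily the parameterization of the mixed-linear model cited above, and all function names are hypothetical.

```python
def additive_code(genotype):
    """Allele count 0, 1, 2 -> additive code -1, 0, 1."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    return genotype - 1


def dominance_code(genotype):
    """Heterozygote indicator: 1 for genotype 1, otherwise 0."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    return 1 if genotype == 1 else 0


def full_model_terms(g1, g2, env):
    """Predictor terms for two SNPs and one environmental covariate."""
    a1, d1 = additive_code(g1), dominance_code(g1)
    a2, d2 = additive_code(g2), dominance_code(g2)
    return {
        # main effects, one locus at a time
        "a1": a1, "d1": d1, "a2": a2, "d2": d2,
        # epistasis: interactions between the two loci
        "aa": a1 * a2, "ad": a1 * d2, "da": d1 * a2, "dd": d1 * d2,
        # gene-environment interaction for the main effects
        "a1_x_env": a1 * env, "d1_x_env": d1 * env,
        "a2_x_env": a2 * env, "d2_x_env": d2 * env,
    }


# One individual: homozygous at SNP 1, heterozygous at SNP 2, exposed (env = 1).
terms = full_model_terms(g1=2, g2=1, env=1)
```

A conventional additive scan would use only the "a1" or "a2" column, one SNP at a time; in this coding, the other columns are the ones that would carry dominance, epistatic and gene-environment signals.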

The full model can analyze not only associations from a genome dataset alone but also those from integrated data of multiple omics, including genome, transcriptome, proteome, metabolome, etc, to systematically identify the biological information flow of the specific molecular architecture that influences the phenotype of interest33. Therefore, use of the full genetic model makes it feasible to comprehensively and systematically dissect how molecular structure, influenced by numerous small-effect genetic variants, is disturbed by the action of the environment or stress to govern the eventual occurrence of variations in the physiological phenotype, disease, TCM syndromes and drug response, thus offering comparatively high practical value in different fields, including Fangji study (Figure 1).

Figure 1

Dissection of the molecular architecture of Fangjiomics using the full genetic analysis of genome-wide association study. (A) The multi-herbal Fangji and the rule of compatibility. Each herb consists of thousands of compounds, some of which have very small effects by themselves and collectively treat the disease. (B) The molecular architecture for individual difference in response to Fangji can be examined using the GWAS approach with full genetic analysis, in which dominance and epistatic (genetic interaction between alleles and non-alleles, respectively) effects are included. (C) The full genetic model analysis of GWAS is able to identify the efficacy or side effects of Fangji, with the variations of doses or herbal constituents considered. The analysis can also integrate other omic information, such as transcriptome or proteome, powerfully deciphering the systems biology of the personalized medicine of Traditional Chinese Medicine using Fangji. (D) The analysis can detect a large number of associated SNPs and genes, and the molecular architecture underlying Fangji's effect can be constructed.

Perspective on the application of full genetic model GWAS to Fangjiomics

Identification of genetic factors underlying individual difference in response to drug treatment is key for precision medicine45,46. Paradoxically, GWAS of drug response, including adverse reactions to a drug, make up less than 10% of the total number of published GWAS reports and have achieved genome-wide significant association with fewer than 300 SNPs, averaging 2 or fewer per study47,48. Despite this limitation, these studies have still provided valuable information on the mechanisms underlying drug distribution, efficacy and toxicology49,50,51. For example, the hepatic CYP2C19 enzyme, a CYP450 superfamily member, regulates the metabolism of many drugs, including antidepressants52,53, and its genetic variants have been found to have a significant effect on the choice of appropriate types and doses of clinical anticoagulants. Studies have also revealed the marked effects of ethnicity on drug response and toxicology54,55,56.

To reveal the mechanisms and factors that influence the effects of Fangji, GWAS is needed for unbiased identification of the complicated underlying molecular architecture. Currently, no GWAS regarding Fangji have been reported. Even for GWAS of conventional drug response, the number of studies is still relatively limited, partly because the conventional GWAS approach requires a large sample size to detect a considerable number of associations, which is more difficult to achieve in drug response studies48. As mentioned above, this obstacle may be at least partly overcome by using full genetic model analysis, by which an appreciable number of significant associations can be disclosed so that the molecular architecture for Fangji's action can be constructed without a very large sample size29. Based on the full genetic model GWAS analysis, it is possible to reveal the systems biology governing the distribution and pharmacodynamics of Fangji, as well as the associated genes and network of efficacy or side effects of Fangji33. Additionally, the full genetic model offers more flexibility by allowing certain variations in Fangji in the model. This is very important because, in the clinical practice of TCM, the composition of a given Fangji is constantly changed, a practice called addition and subtraction, as are the doses or frequency, to treat patients individually. In the full genetic model analysis, these changes in Fangji can all be used as covariates to reveal their involvement in the molecular and phenotypic impacts of Fangji. Furthermore, the full genetic model is also capable of dissecting the molecular architecture of the therapeutic or adverse effects of variable constituents or compounds of Fangji, or Fangjiomics, by using high-efficiency phytochemical isolation and identification technology (Figure 1).
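As a rough illustration of how addition, subtraction, dose and frequency might enter an analysis as covariates, the sketch below turns one patient's prescription into a covariate record. Everything in it is hypothetical: the herb names are placeholders, the base formula and optional herbs are invented for the example, and indicator coding is only one possible way to represent changes in composition.

```python
# Hypothetical base formula and optional herbs; the names are placeholders.
BASE_HERBS = ("herb_a", "herb_b", "herb_c")
OPTIONAL_HERBS = ("herb_d", "herb_e")


def fangji_covariates(prescribed_herbs, daily_dose_g, doses_per_day):
    """Encode one prescription as dose, frequency, and addition/subtraction indicators."""
    prescribed = set(prescribed_herbs)
    unknown = prescribed - set(BASE_HERBS) - set(OPTIONAL_HERBS)
    if unknown:
        raise ValueError("unrecognized herbs: " + ", ".join(sorted(unknown)))
    row = {
        "daily_dose_g": float(daily_dose_g),
        "doses_per_day": float(doses_per_day),
    }
    # Subtraction: a base herb left out of this patient's prescription.
    for herb in BASE_HERBS:
        row["subtracted_" + herb] = 0 if herb in prescribed else 1
    # Addition: an optional herb added to this patient's prescription.
    for herb in OPTIONAL_HERBS:
        row["added_" + herb] = 1 if herb in prescribed else 0
    return row


# A patient whose prescription drops herb_b and adds herb_d.
example = fangji_covariates(["herb_a", "herb_c", "herb_d"], daily_dose_g=30, doses_per_day=2)
```

Each patient then contributes one such record alongside the genotype data, so that differences in prescription can be modeled together with the genetic terms.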

In conclusion, understanding the genetic networks and molecular architecture of Fangji remains a major challenge in the field. Recent progress in GWAS has proven very successful for unbiased dissection of new genetic underpinnings of different diseases, physiological traits and drug responses, although missing heritability has markedly limited the capability to detect associations, partly due to the inability to capture gene-gene and gene-environment interactions using a simple additive model analysis. The development of a full genetic model considering influences from all genetic and gene-environment interactions has substantially reduced the missing heritability seen with the conventional approach. It has been held that the action of Fangji is a result of the synergy of many small-effect ingredients or compounds, in a way similar to a quantitative trait contributed by numerous small-effect genetic factors. It will thus be intriguing to clarify the molecular and genetic machinery underlying the mechanisms of Fangji and the phytochemical substrates of Fangjiomics. To this end, full genetic model analysis of GWAS from human studies or animal models is instrumental. GWAS, or GWAS combined with analysis of other omics, will offer novel insights with a systematic and dynamic bioinformatic flow regarding Fangji's mechanism.