iPhemap: an atlas of phenotype to genotype relationships of human iPSC models of neurological diseases

Abstract Disease modeling with induced pluripotent stem cells (iPSCs) is creating an abundance of phenotypic information that has become difficult to follow and interpret. Here, we report a systematic analysis of research practices and reporting bias in neurological disease models from 93 published articles. We find heterogeneity in current research practices and a reporting bias toward certain diseases. Moreover, we identified 663 CNS cell‐derived phenotypes from 243 patients and 214 controls, which varied by mutation type and developmental stage in vitro. We clustered these phenotypes into a taxonomy and characterized these phenotype–genotype relationships to generate a phenogenetic map that revealed novel correlations among previously unrelated genes. We also find that alterations in patient‐derived molecular profiles are associated with cellular phenotypes, and that dysregulated genes show predominant expression in brain regions with pathology. Last, we developed the iPS cell phenogenetic map project atlas (iPhemap), an open‐submission online database to continually catalog disease phenotypes. Overall, our findings offer new insights into the phenogenetics of iPSC‐derived models, while our web tool provides a platform for researchers to query and deposit phenotypic information on neurological diseases.

Heatmap of the relationship between phenotypes and developmental stages in vitro. Colored boxes denote the relationship of similar genes at stages of differentiation. Phenotypes are on the X-axis and disease abbreviations are on the Y-axis, grouped by their disease category: neurodegenerative (red), neurodevelopmental (green), and viral-induced and psychiatric (black). The majority of phenotypes reported in neurodegenerative diseases were in mature cell types, while phenotypic abnormalities were observed through all cell types in genes linked to neurodevelopmental disorders. Quantifications can be found in Figure 4 and a full table of phenotypes, organized from left to right, in Appendix Table S4.

Appendix Figure S2. Statistical pathway analysis of overlapping phenogenetic network. A) Node degree distribution plot of the overlapping network. Fitting these data with a power-law distribution returned the fitted line equation $y = 53.358\,x^{-1.163}$, with the following statistical parameters: $R^2 = 0.717$ and $r = 0.922$, which suggests that this network is in accord with a scale-free network (n=64, F(1,62)=3.85). B) Shortest path length distribution plot illustrating the smallest number of phenotypes between genes. C) Neighborhood connectivity (n=23, F(1,21)=11.91) and D) topological coefficient distribution (n=100, F(1,98)=70.71) are both statistically significant for a power-law distribution, which demonstrates that a majority of phenotypes are distinct to a single gene within the network.
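As an illustration of the kind of power-law fit described for the node degree distribution, the following is a minimal sketch that assumes an ordinary least-squares fit on log-log axes; the caption does not state which fitting procedure or software was used. The function name `fit_power_law` and the degree values are hypothetical, and the counts are synthetic points generated from the reported line equation, not the network data.

```python
import math

def fit_power_law(degrees, counts):
    """Fit counts = a * degree**b by least squares on log-log axes.

    Assumption: the fitting procedure is not specified in the caption;
    log-log linear regression is one common choice.
    """
    xs = [math.log(d) for d in degrees]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # exponent (slope in log-log space)
    a = math.exp(mean_y - b * mean_x)    # prefactor (from the intercept)
    r_squared = (sxy * sxy) / (sxx * syy)  # computed on the log-transformed values
    return a, b, r_squared

# Synthetic, noise-free points drawn from the reported line y = 53.358 x^-1.163
# (hypothetical degree values; NOT the data behind the figure).
degrees = [1, 2, 3, 4, 6, 8, 12]
counts = [53.358 * d ** -1.163 for d in degrees]

a, b, r2 = fit_power_law(degrees, counts)
print(f"a = {a:.3f}, b = {b:.3f}, R^2 = {r2:.3f}")
```

With real degree counts, the goodness of fit would fall below 1, and the value obtained depends on whether it is computed on the raw or the log-transformed data.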

Appendix Figure S3. Extended patient-derived iPSC phenogenetic map version 2017, including non-overlapping genes. Phenotypes are colored by the taxonomy described in Figure EV2, and identifying numbers of each phenotype are listed in Appendix Table S10. This extended network view was generated through the same method of analysis as

Appendix Figure S9A-C. Expression patterning of dysregulated genes from iPSCs. Heatmaps show localization of dysregulated gene expression in the progenitor structures (SVZ) of the prenatal human brain, in the late postconception weeks of brain development, and in the pons and myelencephalon (My) of the adult human brain from iPSCs with A-C) FXN gene mutations. HTS, hindbrain transient structures; SG, subpial granular zone; SP, subplate zone; SS, sulci and spaces.
Appendix Figure S9D-F. Expression patterning of dysregulated genes from iPSCs. Heatmaps show localization of dysregulated gene expression in the developing cortex (CP, SP) and progenitor zones (SVZ) of the prenatal human brain, in the very early postconception weeks of development, and in the cerebellum and globus pallidus (CxN) of the adult human brain from iPSCs with D-F) SNCA

Note: In cases with multiple genes studied, the patient quantities reported under "Number of Disease Patients Studied" are listed in the same order as in the "Gene/ Mutation" column.

Appendix Table S2. Categorical Cluster Descriptions

Impairment of expected cellular functions
This category contains any phenotype that can be described by the presence of a disrupted/changed state of a structure or process that is expected and found in a healthy version of the same cell, and that cannot be described in terms of increases or decreases. e.g. Impaired structure of adherens junctions (PMID: 24996170)

Absence of expected normal phenotype
This category contains any phenotype that can be described by the complete loss of a function that is found in a healthy version of the same cell. e.g. Absence of random X-inactivation (PMID: 21372419)

Decreased cellular processes and products
This category contains any phenotype that can be described by a decrease in the rate and/or output of any process and the products it creates within a cell. e.g. Decreased levels of co-localized LC3/LAMP-1 (PMID: 22407749)

Increased susceptibility to chemical exposure
This category contains any phenotype that can be described by an increase in a cell's susceptibility to death and other harmful events after being exposed to a specific chemical. e.g. Increased susceptibility of neurons to valinomycin (PMID: 22764206)

Rescue/recovery from disease phenotypes after chemical treatment
This category contains any phenotype that can be described by the return of a disease phenotype to the normal healthy cellular phenotype after the cell is treated with a certain chemical. e.g. Rescue of aberrant cellular parameters after treatment with LRRK2-In-1 (PMID: 23075850)

Presence of abnormal cellular structures
This category contains any phenotype that can be described by the observation of a disrupted or altered cellular structure not observed in a healthy cell. e.g. Presence of constricted/tapered neurites (PMID: 24319659)

Decreased susceptibility to chemical exposure
This category contains any phenotype that can be described by a decrease in a cell's susceptibility to death and other harmful events after being exposed to a specific chemical. e.g. Decreased susceptibility to PI3K inhibitor after ectopic progranulin expression (PMID: 23063362)

Increased cellular processes and products
This category contains any phenotype that can be described by an increase in the rate and/or output of any process and the products it creates within a cell. e.g. Increased spontaneous dopamine release (PMID: 22314364)

Accumulation of molecules
This category contains any phenotype that can be described by the abnormal accumulation of molecules in a cell not seen in healthy cells. e.g. Accumulation of α-synuclein (PMID: 24905578)

Appendix
Increased cell viability of neurons exposed to rotenone after treatment with rapamycin
445 Increased cell viability of neurons exposed to NMDA after treatment with rapamycin
446 Increased cell viability of neurons exposed to Aβ (1-42)