Generation of an Open-Access Patient-Derived iPSC Biobank for Amyotrophic Lateral Sclerosis Disease Modelling

Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease affecting the upper and lower motor neurons, causing patients to lose control over voluntary movement, and leading to gradual paralysis and death. There is no cure for ALS, and the development of viable therapeutics has proved challenging, demonstrated by a lack of positive results from clinical trials. One strategy to address this is to improve the tool kit available for pre-clinical research. Here, we describe the creation of an open-access ALS iPSC biobank generated from patients carrying mutations in the TARDBP, FUS, ANXA11, ARPP21, and C9ORF72 genes, alongside healthy controls. To demonstrate the utilisation of these lines for ALS disease modelling, a subset of FUS-ALS iPSCs were differentiated into functionally active motor neurons. Further characterisation revealed an increase in cytoplasmic FUS protein and reduced neurite outgrowth in FUS-ALS motor neurons compared to the control. This proof-of-principle study demonstrates that these novel patient-derived iPSC lines can recapitulate specific and early disease-related ALS phenotypes. This biobank provides a disease-relevant platform for discovery of ALS-associated cellular phenotypes to aid the development of novel treatment strategies.


Introduction
Since their discovery in 2007, human induced pluripotent stem cells (iPSCs) have broadened our understanding of basic biology and transformed the field of developmental biology [1]. iPSCs harbour similar properties to embryonic stem cells and can be guided to differentiate into various specialised cell types with high levels of purity, but they are derived from somatic cells. iPSC-derived neurons can be generated in as little as three days using doxycycline-inducible transcription factor systems or in two weeks using chemically defined conditions [2,3]. In addition, advances in 3D organoid and co-culture systems mean we can model diseases with greater complexity than ever before in human cells [4].
iPSCs hold great promise for drug screening and personalised medicine [5] and can be derived from patients with neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, and amyotrophic lateral sclerosis (ALS) [6][7][8]. iPSCs are readily accessible, have unmatched capabilities when modelling specific cell types in vitro, and are not burdened with the ethical concerns typically associated with human embryonic stem cells [5]. In addition, the generation of genetically modified iPSC lines increases the versatility of stem cell models of complex genetic disorders. This can be achieved with CRISPR/Cas9 technology by correcting genetic mutations in patient-derived lines or inserting mutations into well-characterised healthy control lines [9]. Patient-derived iPSC lines can give rise to a large spread in experimental data due to the inherent genetic variability present across cell lines derived from different individuals. This is partly avoided by generating isogenic cell lines with CRISPR/Cas9 technology. However, these models require careful consideration, including analysis of off- and on-target effects [10,11]. Such effects can cause variability in CRISPR/Cas9-generated iPSC lines, resulting in cellular phenotypes that are not necessarily associated with the mutation being modelled. Further, phenotypes observed within a single genetic background should be interpreted with caution when extrapolating experimental results from isogenic lines. Combined analysis of CRISPR/Cas9-generated and patient-derived iPSC lines may be essential for distinguishing real cellular phenotypes from non-specific signals that can arise from clonal and experimental variability. Therefore, the generation of iPSC lines from patients with a range of genetic disease risk factors will broaden the scope of pre-clinical research and improve therapeutic targeting for a larger patient cohort.
ALS is a neurodegenerative condition which affects the upper and lower motor neurons [12], with a global mortality rate of approximately 30,000 patients each year [13]. In the UK alone, the total number of newly diagnosed ALS cases is estimated to rise from 1701 in 2020 to 2635 in 2116 [14]. ALS has no cure, and over 50 clinical trials (CTs) have failed over the past 25 years [13,15]. Currently, Riluzole is the only disease-modifying drug for ALS, extending survival by 6-19 months, and it received commercial authorisation in Europe in 1996 [16]. A retrospective study found that Riluzole-mediated survival occurs in the last clinical stage of ALS, indicating that increased lifespan occurs during the disease stage where symptoms are most severe [17]. Co-treatment with Riluzole and the tyrosine kinase inhibitor Masitinib slowed the rate of functional decline compared to Riluzole plus placebo in a phase 2/3 clinical trial [18], indicating that combination therapy may be beneficial in ALS. The drug Edaravone received approval from the FDA in the USA in 2017 [19] but did not yield a positive outcome in a multicentre clinical trial in Italy [20]. The lack of viable and novel therapeutics for ALS indicates that more research and better drug-target identification are essential.
ALS is characteristically complex and heterogeneous, which has likely contributed to the overriding failure of ALS-CTs. ALS subclassification and stratification may increase the success rate of CTs [21,22], which might be achieved through patient phenotyping and genotyping. Screening patients for known genetic causes of ALS is particularly relevant when developing treatment strategies that target specific genomic aberrations, such as gene therapies, including antisense oligonucleotides (ASOs) [23]. The first ASO for treatment of SOD1-ALS, Tofersen, was recently approved by the FDA, a milestone in drug development for ALS [24]. Additional data suggest that ASOs targeting specific genetic mutations in SOD1, C9ORF72, FUS, and ATXN2 hold potentially beneficial clinical outcomes [25], and patient-derived cells can be used to test novel therapeutics such as ASOs. iPSC-derived neurons and glia recapitulate key pathological features associated with ALS, including increased neuronal death and mislocalisation of disease-associated proteins to the cytoplasm [26][27][28][29]. These cellular phenotypes represent useful markers of drug efficacy in disease-relevant cell types, indicating how improving the availability and variety of patient-derived iPSCs may enhance the translational impact of research.
Patient-derived cell biobanks allow us to determine disease-related phenotypes and test the toxicity and effectiveness of novel compounds. Such a biobank was created in 2003 by The Motor Neurone Disease Association, named The UK MND Collections. This biobank contains more than 3000 blood samples including lymphoblastoid cell lines (LCLs) and peripheral blood lymphocytes (PBLs) from healthy volunteers and ALS patients, alongside clinical and epidemiological information [30]. LCLs are immortalised cell lines derived from peripheral B lymphocytes infected with Epstein-Barr virus (EBV) [31]. LCLs are an excellent source of DNA and are a useful tool for in vitro experimentation; their stable genome and transcriptome properties, inexpensive maintenance, and easy manipulation make LCLs great value for disease research [32]. However, LCLs do not represent the cell types affected by disease when modelling neurodegenerative conditions such as ALS.
To address this, we repurposed The UK MND Collections as a resource for generating patient-derived iPSC lines.
Here, we describe the generation of this ALS-iPSC biobank alongside proof-of-concept data generated with FUS-ALS patient-derived iPSC-neurons to demonstrate the prevalence of disease-specific early phenotypes in patient-derived cells. Thirty-five iPSC lines were generated as part of The UK MND Collections, which can be openly accessed through The Motor Neurone Disease Association.

iPSC Characterisation
iPSCs were subject to quality control (QC) experiments, including pluripotency immunocytochemistry and embryoid body assay (see below). Patient-derived lines were sequenced at the mutation region by sending PCR products and primers to Source BioScience (London, UK). Genomic integrity was assessed by G-band or digital karyotyping: G-band karyotyping was outsourced to TDL Genetics (London, UK) or to the Genome Editing and Embryology Core (King's College London, UK), and digital karyotyping was completed with KaryoStat™ Karyotyping Service (Thermo Fisher). To confirm the loss of EBV genes, iPSC lines were serially passaged, and genomic DNA was screened for EBV genes with PCR. DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), and PCR was performed using primers targeting EBV genes (EBNA2, LMP1, BZLF and oriP) [34] and the housekeeping gene SDHA, using Q5® High-Fidelity 2× Master Mix (NEB, MA, USA) with 35 cycles of 95 °C for 30 s, 61 °C for 30 s, and 72 °C for 30 s. PCR products were separated by gel electrophoresis in 4% agarose gels with 1% ethidium bromide, then visualised and photographed inside a UV transilluminator. STR profiling was outsourced to Source BioScience (London, UK); 16 STR loci were analysed and matched between iPSCs and parent LCL or PBL lines.
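The identity check described above amounts to comparing allele calls locus by locus between an iPSC line and its parent line. The following is a minimal sketch of that comparison, not the service provider's method: the function name `str_profiles_match`, the locus names, and the allele values are all hypothetical, and only three loci are shown where the real panel covered 16.

```python
def str_profiles_match(ipsc, parent):
    """Return (all_match, mismatched_loci) for two STR profiles.

    Each profile maps a locus name to the alleles called at that locus.
    Alleles are compared as unordered sets, so (9, 11) equals (11, 9).
    A locus missing from either profile counts as a mismatch.
    """
    loci = set(ipsc) | set(parent)
    mismatched = sorted(
        locus for locus in loci
        if locus not in ipsc
        or locus not in parent
        or frozenset(ipsc[locus]) != frozenset(parent[locus])
    )
    return (not mismatched, mismatched)


# Hypothetical three-locus example; names and alleles are placeholders.
parent_lcl = {"locus_A": (9, 11), "locus_B": (14, 14), "locus_C": (6, 9.3)}
ipsc_clone = {"locus_A": (11, 9), "locus_B": (14, 14), "locus_C": (6, 9.3)}

matched, differing = str_profiles_match(ipsc_clone, parent_lcl)
print(matched, differing)
```

Returning the list of differing loci, rather than a bare true/false, makes it easier to see whether a mismatch reflects a sample mix-up or a single ambiguous call.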

Imaging and Analysis
Images were acquired with a Leica CYR5000 light microscope (Leica Microsystems, Wetzlar, Germany), a Leica TCS-SP5 microscope (Leica Microsystems, Germany), or an Opera Phenix® High-Content Screening System (PerkinElmer, Waltham, MA, USA). Motor neuron quantification was performed in the Columbus™ Image Data Storage and Analysis system (PerkinElmer, USA). Approximately 8-10 fields from two separate wells of a 96-well plate were quantified in three biological replicates. Neurite outgrowth was quantified in individual neurons using the ImageJ plug-in NeuriteTracer.

Calcium Imaging
Motor neurons were cultured in 96-well plates and aged for 108 days. Cells were incubated with 2 mM Fluo4-AM in an external solution (145 mM NaCl, 2 mM KCl, 5 mM NaHCO3, 1 mM MgCl2, 2.5 mM CaCl2, 10 mM glucose, and 10 mM Na-HEPES (pH 7.25)) and 0.02% Pluronic F-127 (Thermo Fisher) for 15 min at 37 °C. Subsequently, neurons were rinsed in fresh external solution for another 15 min at 37 °C. Live image collection was performed with an Opera Phenix® High-Content Screening System with a 20× water objective. Data were collected for 2 min, with one image taken every 2.8 s, and data were processed in ImageJ. Spontaneous calcium fluctuations were calculated as the relative Fluo4-AM fluorescence intensity normalised to the background, (F − F0)/F0, across 10 regions of interest (ROIs).
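The normalisation described above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the analysis pipeline used in the study (which was run in ImageJ): the function name `delta_f_over_f0`, the example intensities, and the choice of F0 value are hypothetical, and the frame count is simply the number of 2.8 s intervals that fit in a 2 min recording.

```python
def delta_f_over_f0(trace, f0):
    """Normalise one ROI fluorescence trace as (F - F0) / F0.

    `trace` is a sequence of raw intensities, one per frame.
    `f0` is the background or baseline intensity for that ROI; how F0
    is chosen is left to the caller and is an assumption in this sketch.
    """
    if f0 <= 0:
        raise ValueError("F0 must be positive")
    return [(f - f0) / f0 for f in trace]


FRAME_INTERVAL_S = 2.8  # one image every 2.8 s (from the text)
RECORDING_S = 120       # 2 min recording (from the text)
n_intervals = int(RECORDING_S // FRAME_INTERVAL_S)  # whole intervals in 2 min

# Hypothetical raw intensities for a single ROI (arbitrary units).
roi_trace = [100, 100, 150, 120, 100]
normalised = delta_f_over_f0(roi_trace, f0=100)
print(n_intervals, normalised)
```

In practice the same function would be applied to each of the 10 ROIs, giving one normalised trace per ROI.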

Generation of iPSCs from LCLs and PBLs
Twenty-four novel iPSC lines were generated from a selection of healthy control and ALS patient LCLs, and five iPSC lines were derived from PBLs (Table 1). An additional six patient-derived LCL-iPSC lines also constitute part of the UK MND Collections and have been reported [36]. Across the collection and within the ALS group, LCL-derived iPSCs were generated from five patients with mutations in ANXA11 (2× G38R, 2× D40G, 1× R235Q), three with TARDBP mutations (1× M337V, 1× G348V, 1× N378D), four with C9ORF72 GGGGCC intronic expansions, three with mutations in FUS (1× R519E, 1× R521H, 1× R522G), and four with mutations of unknown significance in ARPP21 (3× P529L, 1× P713L). The remaining line is derived from a sporadic ALS patient with no known genetic mutation. An additional ten control lines were generated from healthy individuals (5× male, 5× female). Five PBL-derived iPSC lines were generated from the same donor cohort, including two healthy control and three ALS patient-derived lines: two with mutations in TARDBP (1× G348V, 1× N378D) and one with a mutation of unknown significance in ARPP21 (1× P529L). Healthy donor and ALS patient demographic information for the newly generated lines is included in Table 1.
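As a cross-check on the composition described above, the per-group counts can be tallied against the stated totals. This sketch assumes that the LCL genotype list covers both the 24 novel and the 6 previously reported LCL-derived lines, and that the ten healthy control lines listed with them are LCL-derived; the variable names are illustrative only.

```python
# Counts transcribed from the paragraph above (LCL-derived lines by donor group).
lcl_lines = {
    "ANXA11": 5,
    "TARDBP": 3,
    "C9ORF72": 4,
    "FUS": 3,
    "ARPP21": 4,
    "sporadic ALS": 1,
    "healthy control": 10,
}
# PBL-derived lines by donor group.
pbl_lines = {"TARDBP": 2, "ARPP21": 1, "healthy control": 2}

total_lcl = sum(lcl_lines.values())
total_pbl = sum(pbl_lines.values())

NOVEL_LCL = 24               # newly generated LCL-derived lines
PREVIOUSLY_REPORTED_LCL = 6  # LCL-derived lines reported in [36]

# The genotype breakdown should account for every LCL-derived line.
assert total_lcl == NOVEL_LCL + PREVIOUSLY_REPORTED_LCL

total_lines = total_lcl + total_pbl
print(total_lcl, total_pbl, total_lines)
```

Under these assumptions the tally gives 30 LCL-derived and 5 PBL-derived lines, which agrees with the thirty-five lines stated for the collection.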

Characterisation of Newly Derived iPSCs
Newly generated LCL- and PBL-derived iPSC lines were subject to standard QC testing. Example data are included in Figure 1, and individual QC data are available alongside cell lines. A summary of QC results for all cell lines is included in Table 2. All iPSCs showed typical morphology with small round cells with large nuclei growing in defined colonies (Figure 1B). Immunocytochemistry targeting the pluripotency markers OCT3/4 and Nanog indicated the expression of stem cell-specific proteins in iPSCs (Figure 1A). The embryoid body assay was included as an additional measure of pluripotency: iPSCs were allowed to spontaneously differentiate, and cultures were probed for cells originating from the three layers of the blastocyst. Embryoid bodies were immunolabelled for the mesodermal protein smooth muscle actin (SMA) and the endodermal marker alpha-fetoprotein (AFP), and β3-Tubulin was used to identify cells derived from the ectoderm (Figure 1C).
Cell line identity was confirmed using STR profiling to ensure that daughter iPSC lines correctly matched parent LCL or PBL genetic profiles (in line with patient data protection). In addition, ALS patient-derived lines with known genetic mutations were directly sequenced and their genotypes were verified (Figure 1F). To confirm the loss of EBV genes in LCL-derived iPSCs, clonal lines were serially passaged, and genomic DNA preparations were interrogated for the persistence of EBV genes (Figure 1G). Where possible, if EBV genes were still present in iPSC lines beyond passage 30, the clonal line was discarded, and a new iPSC clone from the same donor was selected. The serial passaging in this QC step had the potential to introduce cell line abnormalities, and so was completed prior to any pluripotency or genomic analyses.
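The EBV screening rule described above can be written as a simple decision function. This is an illustrative sketch only: the function name `ebv_qc_decision` and the return labels are hypothetical, and treating a band for any single target as EBV-positive is an assumption, since the text does not state how partial results were scored. A real screen would also require the SDHA housekeeping band to confirm that the PCR worked.

```python
EBV_TARGETS = ("EBNA2", "LMP1", "BZLF", "oriP")  # PCR targets named in the Methods
PASSAGE_CUTOFF = 30  # clones still EBV-positive beyond this passage were discarded


def ebv_qc_decision(passage, bands_detected):
    """Classify a clone from one PCR screen.

    `bands_detected` maps each EBV target to True if a band was seen.
    Assumption: a band for any one target marks the clone EBV-positive.
    Returns "clear", "keep passaging", or "discard".
    """
    positive = [t for t in EBV_TARGETS if bands_detected.get(t, False)]
    if not positive:
        return "clear"
    if passage > PASSAGE_CUTOFF:
        return "discard"
    return "keep passaging"


# Hypothetical screens of one clone at increasing passage number.
print(ebv_qc_decision(12, {"EBNA2": True, "oriP": True}))
print(ebv_qc_decision(24, {}))
print(ebv_qc_decision(31, {"LMP1": True}))
```

A discarded clone would be replaced by a new clone from the same donor, as described in the text.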
Measures of genome stability included either G-band karyotyping (Figure 1D) or digital karyotyping (Figure 1E). G-band karyotyping was initially included to confirm the absence of gross chromosomal changes that might arise during reprogramming. This was extremely labour intensive and low-throughput; therefore, KaryoStat™ digital karyotyping was subsequently included to circumvent these issues. KaryoStat™ was deemed a suitable alternative because, although it cannot detect balanced translocations, it offers an increased indel resolution compared to G-banding and is able to detect culture mosaicism with a limit of 30%. Additionally, if indels are detected in cell lines, the affected loci are identified, and so the potential consequences of structural changes can be assessed based on the functions of affected genes and any known disease associations.

Phenotypic Screening
To investigate whether iPSC lines generated for this biobank can recapitulate key pathological features of ALS, we performed a preliminary analysis of iPSCs derived from two FUS-ALS patients. iPSCs carrying FUS R521H and R522G and two control lines were differentiated into motor neurons using small-molecule-mediated differentiation. iPSCs were first differentiated into OLIG2-positive motor neuron progenitors via an intermediate neuroepithelial stage (Figure 2A,B). Terminal differentiation was achieved by Notch inhibition, giving rise to neuronal cultures with ~70% β3-Tubulin-positive cells and ~50% Islet 1-positive motor neurons (Figure 2B-D). Motor neurons were cultured for 108 days and assessed for spontaneous calcium fluctuations as an indirect measure of synaptic activity. All lines presented with calcium transients, indicating the functional activity of neurons (Figure 2E,F). FUS R521H, FUS R522G, and control motor neurons were immunolabelled for FUS protein on day 21 of differentiation (Figure 3A), indicating a relative increase in FUS protein in the cytoplasm in patient-derived lines compared to controls (Figure 3B). Neurite outgrowth was assessed in young motor neurons on day 21 of differentiation, revealing a decrease in total neurite length in FUS patient-derived lines compared to controls (Figure 4).
Genes 2023, 14, 1108

Discussion
Thirty-five new iPSC lines derived from patients with ALS and healthy controls have been generated and characterised. These iPSCs were derived from patients with mutations in the FUS, C9ORF72, TARDBP, ARPP21, and ANXA11 genes, and from one sporadic patient. LCLs and PBLs from The UK MND Collections were utilised as a resource for the generation of this biobank.
LCLs are B lymphocytes that have been immortalised by infection with EBV, a lymphotropic herpesvirus [38]. In the majority of latent human infections, EBV exists episomally in the nucleus [39]; however, integration into the host genome can occur in cases of Burkitt Lymphoma and other malignancies [40][41][42]. Other EBV-associated diseases, such as mononucleosis, are not typically associated with host genome integration, and in many cases EBV infection does not cause disease [39]. One of the EBV elements, EBNA-1, influences the chromatin architecture of infected cells, creating an "open" chromatin state [43], which may facilitate transcription factor activation and iPSC reprogramming in infected LCL lines [44]. One remarkable characteristic of iPSCs generated from LCLs is that the EBV elements are lost after passaging (Figure 1G) [34,35]. The mechanism by which the EBV elements are lost is yet to be completely understood. We hypothesise that these iPSCs are derived from individual lymphoblastoid cells where viral genes have not integrated; in support of this, EBV episome loss from explanted nasopharyngeal carcinoma cells has been reported [45]. However, if and how viral episomes are lost from iPSCs is undetermined.
Loss of EBV genes from iPSCs is a reproducible phenomenon, as indicated here where most iPSCs derived from ALS patient and control LCLs lost EBV genes before passage 30 (Figure 1G) [34,35]. However, EBV genes could be detected in genomic DNA extracts beyond passage 30 in approximately 25% of the screened iPSC clones. In these instances, it is possible that the EBV elements had integrated into the genome of the original lymphoblastoid cell, and thus the daughter iPSC. Subsequent analysis of EBV integration was not conducted in these instances. Prior infection with EBV was recently associated with multiple sclerosis, indicating that the presence of EBV genes might contribute to motor neuron pathophysiology [46]. To circumvent any possible confounding effects of EBV elements on cellular phenotypes, clonal iPSC lines in which EBV genes were detected beyond passage 30 were discarded, and a new clone was selected for characterisation. This was unexpected and time-consuming, which should be considered when utilising LCLs as a resource for iPSC reprogramming in future studies. Screening for EBV or other viral genes is not necessary when iPSCs are derived from primary cells such as PBLs. No qualitative differences were observed in either the success of iPSC characterisation or in routine iPSC culture between lines derived from each cell type. This suggests that where both materials are available when generating iPSCs from a desired genotype, PBL-derived iPSCs might be a more time- and cost-effective choice. No thorough comparison of LCL versus PBL iPSCs or differentiated cell types is reported here, and additional phenotyping will be necessary to solidify this assertion.
The selection of patient tissue for reprogramming was influenced by the lack of iPSC lines representing certain genotypes for ALS research. In particular, cell lines with mutations in ANXA11 and ARPP21 have not been previously reported, apart from some ANXA11 patient-derived lines that constitute part of this same collection [36]. Mutations in ANXA11 have a proven association with ALS [47][48][49], and the generation of these cell lines will be an important step in elucidating the role of the corresponding protein, Annexin A11, in motor neuron pathogenesis. Mutations in ARPP21 have been identified in ALS patients [50], but the significance of these mutations is unconfirmed [51,52]. The utilisation of these newly generated lines will be essential in confirming the true contribution of ARPP21 to the ALS genetic landscape. Concurrently, if mutations in ARPP21 do not contribute to ALS risk, these cell lines represent sporadic and familial ALS patients with an unknown genetic burden.
Additional lines were generated from patients with well-established genetic associations, namely TARDBP, FUS, and C9ORF72. Multiple iPSC lines with mutations in these genes have been instrumental in progressing our understanding of the cellular pathologies associated with ALS, and these data have been excellently reviewed elsewhere [8,53-55]. iPSC lines generated for this biobank will add to this resource, providing greater opportunities for the identification of cellular pathology and pre-clinical validation of new therapeutics. Newly generated FUS-ALS iPSC-derived neurons mirror the cytoplasmic FUS mislocalisation seen in ALS post-mortem tissue and other FUS-ALS disease models, including other iPSC-derived models (Figure 3) [56][57][58][59]. Neurite outgrowth or branching defects have been observed in previously established FUS-iPSC models, showing variable results, with some reports of increased branching and length in FUS lines [60,61], and others indicating reduced complexity and outgrowth [62]. In addition, increasing the number of reliable control lines is essential when utilising patient-derived iPSCs with variable genetic backgrounds. Therefore, ten healthy control lines were generated from male and female donors aged 61-84 years who presented with no neurological or health condition at the time of blood collection.
In summary, we have generated an open-access iPSC biobank, including multiple lines generated from ALS patients and controls. These can be accessed through The Motor Neurone Disease Association (https://www.mndassociation.org/research/for-researchers/resources-for-researchers/ukmndcollections/ (accessed on 10 May 2023)) alongside QC data for each cell line. In addition, uncharacterised clones from the same cell lines may be available to those interested in the comparison of clonal lines (Table 2). As an example of utilisation of these lines, we have shown that motor neurons derived from FUS-ALS patients recapitulate key pathological features observed in ALS patient tissue and other FUS models. Hence, these newly generated lines will aid ALS disease modelling, contributing to pre-clinical research on developing novel therapeutics.

Informed Consent Statement: Not applicable.
Data Availability Statement: Cell lines can be accessed via The Motor Neurone Disease Association by following the guidance and application form included in the following link: https://www.mndassociation.org/research/for-researchers/resources-for-researchers/uk-mnd-collections (accessed on 10 May 2023).