RNA sequencing reveals MMP2 and TGFB1 downregulation in LRRK2 G2019S Parkinson's iPSC-derived astrocytes

Non-neuronal cell types such as astrocytes can contribute to Parkinson's disease (PD) pathology. The G2019S mutation in leucine-rich repeat kinase 2 (LRRK2) is one of the most common known causes of familial PD. To characterize its effect on astrocytes, we developed a protocol to produce midbrain-patterned astrocytes from human induced pluripotent stem cells (iPSCs) derived from PD LRRK2 G2019S patients and healthy controls. RNA sequencing analysis revealed the downregulation of genes involved in the extracellular matrix in PD cases. In particular, transforming growth factor beta 1 (TGFB1), which has been shown to inhibit microglial inflammatory response in a rat model of PD, and matrix metallopeptidase 2 (MMP2), which has been shown to degrade α-synuclein aggregates, were found to be down-regulated in LRRK2 G2019S astrocytes. Our findings suggest that midbrain astrocytes carrying the LRRK2 G2019S mutation may have reduced neuroprotective ca- pacity and may contribute to the development of PD pathology. contributing to the build-up of toxic α-synuclein species. The neurodegeneration that results from this toxicity may then activate a cascade of inflammatory responses, which midbrain astrocytes with reduced TGFB1 expression are unable to attenuate. Our findings thus propose a mechanism through which midbrain astrocytes influence the pathogenesis of PD. In conclusion, our study demonstrates for the first time that LRRK2 mutations lead to transcriptomic dysregulation in midbrain astrocytes. The identified gene expression changes point to a role for LRRK2 in regulation of the astrocyte extracellular matrix, with implications for neuroprotection in PD. Further studies will be necessary to understand how LRRK2 interacts with ECM proteins, and to realize the functional implications of our findings.


Introduction
Parkinson's disease (PD) is the second most common neurodegenerative disorder affecting approximately 1% of people over 60 years of age (de Lau and Breteler, 2006). PD progressively manifests itself in classical motor impairments including rigidity and tremor, but patients also show non-motor symptoms like cognitive impairment, psychiatric symptoms or sleep dysfunction (Khoo et al., 2013). While the majority of PD cases are idiopathic, mutations within a small number of genes underlie monogenic forms of the disease (Hernandez et al., 2016;Klein and Westenberger, 2012). The main pathological hallmarks of PD are accumulation of α-synuclein (Spillantini et al., 1998) and the selective loss of dopaminergic neurons in the substantia nigra pars compacta (SNpc) (Damier et al., 1999;Kordower et al., 2013).
Cellular models of PD have traditionally focused on dopaminergic neurons. However, there is growing evidence that other non-neuronal cell types such as astrocytes are involved in the degeneration of dopaminergic neurons and contribute to PD pathology .
Astrocytes represent one of the most abundant types of glial cells in the brain (von Bartheld et al., 2016), and play multiple key roles in a wide range of physiological processes ensuring healthy brain function. These processes include synaptic formation and maturation, regulation of neuronal synaptic transmission, maintenance of the blood brain barrier integrity and controlling the homeostasis of ions, neurotransmitters and energy metabolites (Khakh and Sofroniew, 2015;Oberheim et al., 2012;Sofroniew and Vinters, 2010). Several of the variant genes considered to be causative for PD, including PARK2 (Parkin) and PARK7 (DJ-1), have been found to be expressed in astrocytes and have important roles in astrocyte-specific functions such as inflammatory responses, glutamate transport and neurotrophic capacity . However, there is currently little knowledge of the involvement of leucinerich repeat kinase 2 (LRRK2) in astrocyte biology. Mutations in the LRRK2 gene represent the most common known genetic cause of PD (Hernandez et al., 2005;Paisán-Ruíz et al., 2004;Zimprich et al., 2004). The most common PD-associated LRRK2 mutation, G2019S, is found in approximately 4% of familial and 1% of sporadic forms of PD, and the risk of PD for carriers of the LRRK2 G2019S mutation is estimated at 25-43% (Goldwurm et al., 2011;Lee et al., 2017). Although it has been demonstrated that LRRK2 is expressed in purified human astrocytes at similar levels to neurons , no studies have been conducted to investigate the function of LRRK2 in human astrocytes.
Here, we present the first study of iPSC-derived midbrain-patterned astrocytes from PD patients carrying the common LRRK2 G2019S missense mutation. In order to understand the effect of this mutation on astrocyte function, we compared the gene expression profiles of iPSCderived midbrain-patterned astrocytes from PD patients with those from healthy controls, while controlling for differentiation efficiency. The gene expression data reveal changes in expression of genes involved in the extracellular matrix (ECM), which are largely downregulated in PD cell lines. Transforming growth factor beta 1 (TGFB1), which inhibits microglial inflammatory response in a rat model of PD (Chen et al., 2017), and matrix metallopeptidase 2 (MMP2), which has recently been shown to degrade α-synuclein aggregates (Oh et al., 2017), were found to be downregulated in PD astrocytes. These findings suggest that astrocytes in the brains of LRRK2 G2019S carriers may have reduced neuroprotective capacity, contributing to a less inhibited inflammatory response and an increased burden of extracellular α-synuclein aggregates, which in turn could contribute to PD pathology.

Participant recruitment and LRRK2 mutation screening
Participants were recruited to the Discovery clinical cohort through the Oxford Parkinson's Disease Centre and gave signed informed consent to mutation screening and derivation of iPSC lines from skin biopsies (Ethics committee: National Health Service, Health Research Authority, NRES Committee South Central, Berkshire, UK, REC 10/ H0505/71). All UK patients fulfilled the UK Brain Bank diagnostic criteria for clinically probable PD at presentation. Healthy controls were age-matched to within a decade where possible.

Culture and reprogramming of primary fibroblasts
Skin biopsies were obtained from four PD patients and three healthy control donors. Four iPSC lines were generated by retroviral reprogramming and three lines by Cytotune (sendai virus, see Table 1). Reprogrammed cell lines NHDF1, JR053-6, SFC840-03-03, JR036-1 and MK144-7 were characterized previously (Fernandes et al., 2016;Hartfield et al., 2014;Lang et al., 2019;Sandor et al., 2017). For SFC832-03-19 and SFC855-03-06, low passage fibroblast cultures were established from participant skin punch biopsies and reprogrammed using the CytoTune-iPS Sendai Reprogramming kit (Invitrogen) according to the manufacturer's instructions but scaled to 50,000 fibroblasts, as described previously (Fernandes et al., 2016). Clones were transitioned from initial culture on mitotically inactivated CF1 Mouse Embryonic Feeders (Millipore) to feeder-free culture in mTeSRTM medium (StemCell Technologies), on hESC-qualified Matrigel-coated plates (BD), and passaged as patches of cells using 0.5 mM EDTA in PBS. Bulk frozen stocks were tested for mycoplasma (Mycoalert, Lonza) and frozen below p20, or once Sendai virus was cleared. 10 μM ROCK inhibitor (Y27632, Bio-Techne) was added during thaw to promote survival and iPSC were cultured for maximum two weeks post-thaw prior to differentiation to ensure consistency. Characterization of previously unpublished iPSC lines was performed on bulk frozen stocks as earlier described (Fernandes et al., 2016). Briefly, flow cytometry for pluripotency markers TRA-1-60 (Biolegend #33061, Alexa Fluor 488), and Nanog (Cell Signaling #5448, Alexa Fluor 647) was performed on a FACSCalibur (BD Biosciences) (Suppl. Fig. 1). Sendai virus clearance was assessed using real-time PCR (Suppl. Fig. 1). The β-Actin qPCR Control Kit (Eurogentec) (92 bp product) was used as control for normalization. Primers used for assessment of Sendai virus clearance: Genome integrity and cell identity tracking used the Human CytoSNP-12v2.1 beadchip array or OmniExpress24 array (Illumina) on genomic DNA (All-Prep kit, Qiagen) analysed with GenomeStudio and Karyostudio software (Illumina) (Supplementary Fig. 1).

Quantitative real-time polymerase-chain reaction
RNA was extracted using the RNeasy Mini kit (QIAGEN) and cDNA was synthesized using SuperScript IV VILO Mastermix (ThermoFisher Scientific). Quantitative real-time PCR (qRT-PCR) was carried out using fast SYBR green mastermix (ThermoFisher Scientific) on a StepOnePlus machine (ThermoFisher Scientific) as per the manufacturers intructions. 2 -ΔΔCT values were calculated relative to the geometric mean of the reference genes B2M, SDHA and MALAT1. The following primer sequences were used:

Glutamate uptake
Sodium-dependent glutamate uptake was measured by incubating cells with 50 nM L-[3,4-3 H]-glutamic acid in 135 mM NaCl, 3.1 mM KCl, 1.2 mM CaCl 2 , 1.2 mM MgSO 4 , 0.5 mM KH 2 PO 4 , and 2.0 mM glucose (all Sigma), adjusted to pH 7.4, for 10 min. Sodium-independent uptake was measured by substituting NaCl for 135 mM choline chloride (Sigma). Uptake was stopped by the addition of 500 μM cold 1 H-glutamic acid. Cells were lysed with 0.1 M NaOH (Sigma). 3 mL Optiphase Supermix scintillation fluid (Perkin Elmer) was added before analysis using a Tri-Carb 2800 TR liquid scintillation counter (Perkin Elmer). Data were normalized to protein content, calculated using a BCA assay.

RNA-seq library preparation
For each sample, 386 ng total RNA was used for library preparation. RNA samples were prepared using standard poly-A isolation protocol (Illumina TruSeq Stranded mRNA kit) and, following pooling, sequenced on one lane of Illumina HiSeq4000 using 75 bp paired end reads, RNA samples showed consistently high RIN (RNA integrity number) values of at least 8.8. Sequencing resulted in approximately 50-56 million reads per sample. RNA-seq data are available at GEO under accession number: GSE120306.

Confirmation of sample identity
Individual identity of RNA-seq samples was confirmed by comparing genotypes obtained from SNP-array data from undifferentiated cell lines or original fibroblast cell cultures with genotypes derived from RNA-seq data. A high percentage of identical SNPs (> 90%) called by both SNP array and transcriptomic data confirmed sample identity.

Transcriptome analysis
Salmon (0.7.2) (Patro et al., 2017) with default settings was used to transcript expression using ensembl GRCh38 version 87 as the human transcriptome reference, which combined cDNA and ncRNA, and for which alternative sequences were removed before mapping. Gene annotation was obtained from the fasta headers of the ensembl reference used for mapping. Expression was summarised on the gene level using the tximport R package (1.10.1) (Soneson et al., 2015). Normalised counts were obtained from DESeq2 (1.22.2) (Love et al., 2014). Mapping rates against the human transcriptome were consistently high (90 to 93%), and the vast majority of mapped reads were assigned to protein-coding genes (93 to 95%). Differential gene expression analysis was performed with DESeq2 (1.22.2) using protein-coding genes; adjusted p-value cutoff 0.05. Two covariates were included in the design of the model: sex and differentiation efficiency as estimated by GFAP expression levels obtained from RNA-seq data. X-chromosomal genes were excluded from differential expression analysis (see below for further details). Enrichment analysis was performed with the R package ClusterProfiler (3.10.1) (Yu et al., 2012). Protein-protein interaction data were obtained from InBioMap (InWeb_InBioMap core 2016_09_12) (Li et al., 2017). The PPI network with its clusters was obtained with the R package igraph (1.2.3) (function cluster_walktrap) (Csardi and Nepusz, 2006), where proteins in clusters with fewer than 10 members Fig. 1. Differentiation of midbrain-patterned astrocytes. (A) Midbrain-patterned astrocyte differentiation protocol schematic. (B) Quantitative real-time polymerasechain reaction (qRT-PCR) analysis of pluripotency markers (POU class 5 homeobox 1, POU5F1; homeobox protein NANOG, NANOG), midbrain and neuronal transcription factors (FOXA2, LMX1A and PAX6) and astrocyte markers (GFAP, AQP4 and GLAST1) during the differentiation; graphs show average of four lines, mean ± SEM). (C) Immunostaining at DIV 21 for the nuclear marker DAPI and the floorplate markers FOXA2 and LMX1A. (D) Immunostaining at DIV 118 for the nuclear marker DAPI, the astrocyte glutamate transporter GLAST1 and the astrocyte cytoskeletal marker GFAP. Abbreviations: LDN, LDN-193189; SB, SB-431542; SHH, sonic hedgehog C24II; Pur, purmorphamine; FGF8a, fibroblast growth factor 8a; CHIR, CHIR99021; BGTDAC, B: brain-derived neurotrophic factor, G: glialderived neurotrophic factor, T: transforming growth factor beta 3, D: DAPT, A: ascorbic acid, C: N6,2′-O-dibutyryladenosine 3′,5′-cyclic monophosphate; EGF, epidermal growth factor; LIF, leukaemia inhibitory factor; FGF2, fibroblast growth factor 2.
were grouped into a cluster with the label '0'.
Post-hoc power analysis was performed using the R package RNASeqPower (1.22.1) (Hart et al., 2013) and applying the following parameters. Sample size and false positive rate were set to 3 and 0.05, respectively. The effect size (fold change) was tested with values of 2 and 3. The depth parameter was set to the average sum of total counts (mapped reads) across the six samples in million (46.3). The biological coefficient of variation in our dataset was estimated as 0.428, which is the square root of the estimated common dispersion computed by the estimateDisp() function of the edgeR R package (3.24.3) (Robinson et al., 2010).

Generation of heatmap for comparison with purified brain cell types
The heatmap is based on the expression levels of selected genes as used in Fig. S1G in (Sloan et al., 2017). The plot was generated with the pheatmap R package (1.0.12) (Kolde, 2015). Gene expresison levels from six iPSC-derived astrocytes and 29 purified brain cell types  were vst-transformed using the vst() function from DESeq2 R package and the batch effect was removed with removeBatchEffect() from limma R package (3.38.3) (Ritchie et al., 2015). The function pheatmap was called with clustering_method = "ward.D2", scale = "row" and cluster_rows = FALSE.

Distribution of gene expression on X chromosome and autosomes
Empirical cumulative distributions of X-chromsomal and autosomal gene expression in the four female iPSC-derived astrocyte samples is based on using quantile normalised gene expression data from transcript-per-million (TPM) abundances. First, quantile normalization was performed using normalize.quantiles() from the preprocessCore R package (1.40.0) (Bolstad et al., 2003), then, resulting data were log2transformed for plotting (stat_ecdf()). Pairwise Kolmogorov-Smirnov tests (ks.test() in R) were used to test for significant differences between samples of higher and lower XIST-expression. A similar approach, except using RPKM instead of TPM, was used in Patel et al. (2017).

Randomisation for PPI network analysis
The randomisation approach takes into account both the coding sequence length and the number of PPIs of the corresponding proteins. For every input gene a random gene that has a similar coding sequence length and similar connectivity of its corresponding protein was selected by applying a binning approach (Taylor et al., 2015). A total of 1,000,000 random gene sets, of the same size as the number of differentially expressed genes, were tested for their connectivity within the PPI resource (Li et al., 2017) and the number of connections were compared to the one obtained using the list of differentially expressed genes. The PPI analysis was performed on the basis of pairs of genes (using ensembl gene IDs as annotated in the downloaded data file), which means that an interaction between two genes is reported and counted if any of the possible gene products show an interaction.

External RNA-seq datasets
Public RNA-seq data for comparison were obtained for purified brain cell types  and iPSC-derived dopaminergic neurons (Sandor et al., 2017). Data were processed in the same way as RNA-seq data from this study (i.e. mapping to the identical human transcriptome using Salmon with the same settings, and summarising expression on the gene level using the tximport R package).

Differentiation of iPSCs into midbrain-patterned astrocytes
The dopaminergic neurons that selectively degenerate in PD are located in the SNpc, in the midbrain. Although no specific markers of midbrain astrocytes have been identified, it is considered that astrocytes arise from regionally patterned glial progenitors within the brain. The SNpc arises from a region of the developing brain known as the floorplate. We therefore used a modified version of the protocol developed by Kriks et al. (Beevers et al., 2017;Kriks et al., 2011) to pattern floorplate progenitors from monolayer iPSCs (Fig. 1A). Patterning involved dual-SMAD inhibition, and activation of sonic hedgehog (SHH), fibroblast growth factor 8 (FGF8) and WNT signaling (Fig. 1A). At day in vitro (DIV) 20 of the protocol, cells were passaged into medium containing epidermal growth factor (EGF) and leukaemia inhibitory factor (LIF) to induce proliferation (Sanalkumar et al., 2010). Coexpression of the floorplate markers forkhead box A2 (FOXA2) and LIM homeobox transcription factor 1 alpha (LMX1A) was confirmed by quantitative real-time polymerase-chain reaction (qRT-PCR) and immunostaining (Fig. 1B,C).
Progenitors were maintained in EGF and LIF until DIV 30, when medium was modified to contain EGF, FGF2 and Heparin (EFH medium; Fig. 1A). This combination was selected in order to maintain the proliferative state of the cells (Bonni et al., 1997;Sanalkumar et al., 2010;Tropepe et al., 1999). By DIV 33, FOXA2 and LMX1A expression were almost entirely lost, and expression of the pan-neuronal transcription marker paired box protein PAX6 (PAX6), which is expressed in tyrosine hydroxylase (TH)-negative (non-dopaminergic) cells in the developing midbrain (Duan et al., 2013), was detected (Fig. 1B). At DIV 34, cells were passaged and maintained in EFH medium, with passaging every 7 days until DIV 90 (Fig. 1A). PAX6 expression decreased gradually until DIV 69 (Fig. 1B), potentially indicating a switch from neuronal to glial progenitors. At DIV 90, cells were passaged into unsupplemented medium for terminal differentiation and maintained for 4-5 weeks without passaging (Fig. 1A). By DIV 118 Expression of the astrocyte cytoskeletal gene glial fibrillary acid protein (GFAP), the water channel aquaporin-4 (AQP4) and glutamate aspartate transporter 1 (SLC1A3; GLAST1) were detected by qRT-PCR (Fig. 1B). Immunostaining also demonstrated the presence of GFAP and GLAST1 proteins (Fig. 1D). We found that attempts to initiate terminal differentiation earlier than DIV 90, for example at DIV 41 or DIV 69, gave rise to few, if any, GFAP-positive astrocytes (data not shown).

Differentiation and characterization of LRRK2 G2019S iPSC-derived midbrain-patterned astrocytes
Following characterization of cells throughout the differentiation protocol, midbrain-patterned astrocytes were differentiated from iPSC lines generated from seven donors, including four patients with PD who carry the LRRK2 G2019S mutation, and three healthy control individuals (Table 1 and Supplementary Fig. 1). Batches of progenitors derived from each iPSC line were frozen at DIV 90 and used for the experiments described below. Following five weeks of terminal differentiation (DIV 125), the percentage of GFAP-positive astrocytes produced by each line was calculated ( Fig. 2A,B).
In vivo, astrocytes take up glutamate in a sodium-dependent manner through their glutamate transporters (Schousboe et al., 1977). Other proteins that are implicated in PD pathogenesis have been shown to influence glutamate uptake; deficiency of DJ-1 and astrocyte-specific overexpression of α-synuclein both result in impaired glutamate uptake in astrocyte cultures (Gu et al., 2010;Kim et al., 2016). To test the functionality of the astrocytes produced using our protocol, and to investigate whether the LRRK2 G2019S mutation had an effect on glutamate uptake, we measured sodium-dependent uptake of 3 H-glutamate in our cultures. Uptake was achieved by all lines; however, there was no significant effect of the LRRK2 G2019S mutation on this process (Fig. 2C,D).

Transcriptomic analysis
To further characterize our midbrain-patterned astrocytes, and to determine whether the LRRK2 G2019S mutation leads to transcriptomic changes in these cells, we collected samples for RNA-sequencing (RNAseq) analysis. Initial assessment of transcriptomic variability by principal components analysis (PCA) revealed PD1 as an outlying sample (Supplementary Fig. 2A). Evaluation of expression levels of typical pluripotency marker genes revealed unusually high expression of POU5F1 and NANOG in this sample. Levels were approximately 50 (POU5F1) and 30 (NANOG) times higher than the mean expression level in other samples (Supplementary Fig. 2B). These findings indicated residual pluripotency in PD1, so this sample was excluded from further analyses.
We also observed that two of the four female PD samples (PD2 and PD4) had much lower expression of X-inactive specific transcript (XIST) compared with the other female samples (Supplementary Fig. 2C). XIST is involved in the X-chromosome inactivation process in mammalian females and is expressed from the inactive X chromosome. Average expression of X-chromosomal genes showed a slight shift towards higher levels in low-XIST samples than high-XIST samples, indicating potential erosion of X chromosome inactivation (Supplementary Fig. 2D). In contrast, expression levels of autosomal genes did not show a similar deviation. This shift was not statistically significant at the 5% significance level according to pairwise twosample Kolmogorov-Smirnov tests. However, since the disease status (controls versus PD) of the four female samples was confounded by the observed trend, we excluded all X chromosome genes from our analysis.

iPSC-derived midbrain-patterned astrocyte gene expression profiles are similar to human adult astrocytes
We compared our iPSC-derived midbrain-patterned astrocytes with publicly available data from a set of purified human cortical cells including adult astrocytes, neurons, microglia, oligodendrocytes, endothelial cells and fetal astrocytes . PCA projection demonstrated that our iPSC-derived midbrain-patterned astrocyte samples exhibit a gene expression profile that resembles human adult astrocytes and that is distinctive from other brain cell types (Fig. 3A).  Heatmap showing the expression levels of selected marker genes (Sloan et al., 2017) across these cell types. (C) Expression levels of genes SPARC, LHX2 and EMX2 in purified cortical astrocytes and iPSC-derived midbrain-patterned astrocytes. Abbreviations: astro, astrocytes; micro, microglia; endo, endothelial; oligo, oligodendrocytes.
We performed hierarchical clustering of our iPSC-derived midbrainpatterned astrocytes and these human samples using genes that have been shown to be differentially expressed between different cell types in the central nervous system (Sloan et al., 2017). We found that our samples clustered closely with adult astrocytes, and were distinct from the fetal astrocyte group (Fig. 3B). Secreted protein acidic and cysteine rich (SPARC), LIM homeobox 2 (LHX2) and empty spiracles homeobox 2 (EMX2) have previously been identified as differentially expressed in midbrain astrocytes compared to other brain regions (Morel et al., 2017;Xin et al., 2019). Consistent with these studies, we found that SPARC was upregulated and LHX2 and EMX2 were downregulated in our iPSC-derived midbrain-patterned astrocytes when compared with purified cortical astrocytes (Fig. 3C).

LRRK2 is expressed at low levels in iPSC-derived midbrain-patterned astrocytes
Data suggest that LRRK2 expression is similar in isolated human cortical astrocytes and neurons . This comparison has not been made for cells in the human midbrain, or in iPSC-derived midbrain-patterned cultures. We therefore compared LRRK2 expression levels in our iPSC-derived midbrain-patterned astrocytes with data from our previously published iPSC-derived dopaminergic neurons (Sandor et al., 2017), and with purified human cortical cell types . We found that among these cell types, LRRK2 gene expression levels were highest in purified microglia (Supplementary Fig. 3). In comparison, LRRK2 expression in our iPSC-derived midbrain-patterned astrocytes was present but low, and similar to iPSC-derived neurons and purified cortical astrocyte samples.

The extracellular matrix is perturbed in LRRK2 G2019S iPSC-derived midbrain-patterned astrocytes
Genes that are differentially expressed between iPSC-derived midbrain-patterned astrocytes from PD LRRK2 G2019S cases and controls were identified using DESeq2 (Love et al., 2014). Due to concerns about erosion of X-inactivation, genes on the X chromosome were excluded from this analysis (see above). Two covariates were included in the design matrix: sex and differentiation efficiency; both of which correlated with separation of the samples in the PCA ( Supplementary  Fig. 4A,B). Differentiation efficiency was controlled for using expression levels of the astrocyte marker gene GFAP, which was highly correlated with the percentage of astrocytes produced by each cell line (Supplementary Fig. 4C). A total of 279 genes were identified as differentially expressed, 80% (223) of which showed reduced expression in PD samples. Analysis of the function of differentially expressed genes by Gene Ontology enrichment analysis found that the strongest signal came from biological processes of the ECM, which are down-regulated in the PD LRRK2 G2019S samples (Fig. 4A).

Neuroprotective potential may be lost in LRRK2 G2019S iPSC-derived midbrain-patterned astrocytes
Given convergent extracellular functionality among the differentially expressed genes (DEGs), we wanted to test whether the proteins encoded by these DEGs physically interacted with each other more than expected by chance. The protein-protein interaction (PPI) network between the gene products of all DEGs reveals that approximately 42% of proteins encoded by the set of DEGs interact with at least one other protein within that set (Fig. 4B). No simulated gene set of the same size (controlling for node degree and gene length, empirical p-value < 1e-06) had as many interactions as the set of DEGs identified, demonstrating the unusual interactivity of the DEG gene set. The majority of proteins within the PPI network (98 out of 108 proteins) are encoded by genes that are down-regulated in PD LRRK2 G2019S samples. All clusters within the PPI network are enriched for ECM organization (Fig. 4C). For individual clusters, the strongest signals were detected for ECM disassembly and kinase signaling pathways (Cluster 2), collagenrelated processes (Cluster 3), or cell adhesion (Cluster 4).
Of the most interconnected proteins in the PPI network, TGFB1 and MMP2 have both been previously implicated in neuroprotective mechanisms relevant to PD. The neuroprotective role of TGFB1 has been shown in the 1-methyl-4-phenylpyridinium (MPP+) rat model of PD, where TGFB1 treatment led to inhibition of microglial inflammatory response (Chen et al., 2017). MMP2 has been shown to degrade extracellular α-synuclein aggregates in vitro and in vivo, leading to a reduction in cellular apoptosis as compared with controls (Oh et al., 2017). These proteins were identified as the most inter-connected within cluster 2, with 10 and 9 PPIs, respectively.
In our transcriptomic dataset, as well as being highly interconnected in the PPI, both TGFB1 and MMP2 were expressed at significantly lower levels in midbrain-patterned astrocytes carrying the LRRK2 G2019S mutation than in healthy control samples; a finding that was confirmed by qRT-PCR (Fig. 4D). To test whether changes in expression levels of these two genes are specific to astrocytes, we reviewed the RNA-seq results of human iPSC-derived LRRK2 G2019S dopaminergic neurons (Sandor et al., 2017) and found that they were not differentially expressed. This was confirmed by qRT-PCR (Supplementary Fig. 5). Reduced expression of these genes indicates that astrocytes from patients with the LRRK2 G2019S mutation may have reduced neuroprotective capacity.

Discussion
Mutations in LRRK2 are a common cause of PD. Our study of iPSCderived midbrain patterned astrocytes found that LRRK2 was expressed in these cells, at similar levels to purified human cortical astrocytes and iPSC-derived dopaminergic neurons, and at lower levels than purified human microglia and oligodendrocytes. The LRRK2 G2019S mutation led to downregulation of genes involved in the ECM, which is essential for cell-cell communication through extracellular signaling (Vargová and Syková, 2014). This type of communication is important for initiating astrocyte response to neuronal activity, and modification of ECM composition can have profound effects on astrocyte response to inflammatory stimuli (Johnson et al., 2015).
TGFB1 and MMP2 are intrinsically involved in regulation of the ECM. We found that both of these genes were down-regulated in midbrain-patterned astrocytes carrying the LRRK2 G2019S mutation. Furthermore, their protein products were highly connected in our PPI network, suggesting they may have a key role in LRRK2 G2019Smediated changes in midbrain astrocytes. MMPs modulate cell-cell and cell-ECM interactions by degradation or proteolytic activation of ECM proteins (Baker et al., 2002), and TGFB1 has been shown to induce expression of MMP2 in a range of cell types including astrocytes (Dhar et al., 2006;Hua et al., 2016;Takahashi et al., 2014). In the presence of interleukin-1β (IL-1β), it has been shown that TGFB1 induction of MMP2 proceeds through downregulation of Tissue Inhibitor of Metalloproteinase (TIMP)1 (Dhar et al., 2006). TIMP proteins (TIMP1-4) are known to inhibit the activity of MMPs (Arpino et al., 2015), and higher expression of TIMPs have been associated with lower levels of MMPs. In our model this was not the case, TIMP2 and TIMP4 were downregulated and TIMP1 and TIMP3 expression was unchanged. This finding may be explained by the fact that in addition to its role in inhibiting MMPs, TIMP2 has been shown to mediate MMP2 activation (Lafleur et al., 2003). The finding that TIMP2 is downregulated in our model may therefore reflect reduced activation, rather than reduced inhibition of MMP2. TGFB1 and MMP2 have both been shown to exhibit neuroprotective effects that are relevant to PD. TGFB1 treatment has been shown to inhibit microglial inflammatory response in the MPP+ rat model of PD (Chen et al., 2017). Downregulation of TGFB1 expression in our model may therefore point to impaired regulation of inflammatory responses in the brains of LRRK2 G2019S carriers. Recently, it has been shown that MMP2 degrades α-synuclein in vitro and in vivo and leads to a reduction in cellular apoptosis (Oh et al., 2017). Reduction of MMP2 expression in our LRRK2 G2019S iPSC-derived midbrain-patterned astrocytes versus controls is supported by similar findings in the substantia nigra of post-mortem brains from PD patients (Lorenzl et al., 2002).
Our study likely under-reports the degree of altered gene expression in patient-derived LRRK2 G2019S astrocytes. Power analysis shows that three samples per group provide an estimated 47% power to detect a two-fold gene expression change rising to 84% for genes with a fold change of three or more, which is the case for both MMP2 and TGFB1 (see Methods). A key issue underlying study power here is the high biological coefficient of variation across samples due in part to culture cellular heterogeneity (see Supplementary Fig. 4), which has been shown to contribute significantly to differences detected between iPSCderived cell populations (for example, Volpato et al. (2018)). Future studies would benefit from the development of cell sorting approaches or the application of single cell sequencing to reduce this confound.
Evidence for the midbrain identity of our cultures was obtained using midbrain astrocyte markers recently identified in two different studies (Morel et al., 2017;Xin et al., 2019). Morel et al. showed upregulation of sparc and downregulation of lhx2 and emx2 in ventral versus dorsal mouse astrocytes, while Xin et al. identified the same trend in mouse ventral midbrain versus cortical astrocytes. We found that the human orthologs of these markers exhibited the same expression pattern in our iPSC-derived astrocytes when compared to purified cortical astrocytes.
Together, these data suggest that midbrain astrocytes of PD patients may have a reduced ability to degrade α-synuclein, thereby contributing to the build-up of toxic α-synuclein species. The neurodegeneration that results from this toxicity may then activate a cascade of inflammatory responses, which midbrain astrocytes with reduced TGFB1 expression are unable to attenuate. Our findings thus propose a mechanism through which midbrain astrocytes influence the pathogenesis of PD.
In conclusion, our study demonstrates for the first time that LRRK2 mutations lead to transcriptomic dysregulation in midbrain astrocytes. The identified gene expression changes point to a role for LRRK2 in regulation of the astrocyte extracellular matrix, with implications for neuroprotection in PD. Further studies will be necessary to understand how LRRK2 interacts with ECM proteins, and to realize the functional implications of our findings.