Parkinson’s disease risk enhancers in microglia

Summary
Genome-wide association studies have identified thousands of single nucleotide polymorphisms that associate with increased risk for Parkinson’s disease (PD), but the functions of most of them are unknown. Using assay for transposase-accessible chromatin (ATAC) and H3K27ac chromatin immunoprecipitation (ChIP) sequencing data, we identified 73 regulatory elements in microglia that overlap PD risk SNPs. To determine the target genes of a “risk enhancer” within intron two of SNCA, we used CRISPR-Cas9 to delete the open chromatin region where two PD risk SNPs reside. The loss of the enhancer led to reduced expression of multiple genes including SNCA and the adjacent gene MMRN1. It also led to expression changes of genes involved in glucose metabolism, a process that is known to be altered in PD patients. Our work expands the role of SNCA in PD and provides a connection between PD-associated genetic variants and underlying biology that points to a risk mechanism in microglia.


INTRODUCTION
Genetic studies suggest that a significant portion of Parkinson's disease (PD) risk is heritable (estimated to be up to 36%). 1 Unlike some rare disorders caused by highly penetrant mutations in a small number of genes, PD genome-wide association studies (GWASs) have uncovered thousands of low-penetrance single-nucleotide polymorphisms (SNPs) with more modest influences on disease risk. It is hypothesized that their small effects play a role in PD predisposition through subtle changes in gene expression over the course of an entire lifespan. There are thus ongoing efforts to understand the effects of these (and other) DNA risk variants prior to disease onset.
The findings of GWASs have been challenging to interpret as not all PD-associated SNPs are biologically functional in relevant cellular and developmental contexts. Moreover, most of them are in non-coding DNA. 2 Some variants within enhancers and promoters are known to influence gene expression regulation. 3 However, the genes whose expression they target are not immediately apparent. 4 Thus, the first challenges in the field are to 1) determine which SNPs are imposing PD risk by altering biology and 2) identify mechanisms by which each allele of a risk SNP leads to biological differences.
A large portion of PD risk SNPs are enriched in regulatory enhancers and promoters across multiple cell types, implying that PD susceptibility may in part be attributed to genetic variation that impacts the regulation of cell-type-specific genes and cellular processes. 5 Microglia are the resident macrophages in the brain and are known contributors to neuroinflammation in PD. 6 PD-associated variants reside in regulatory elements or are in and around PD-associated genes in these cells. 7,8 However, the variants that are functional and how they influence microglia processes are largely unknown. It has been demonstrated that SNPs within regulatory elements alter transcription factor binding and gene expression. 3,9,10 The effects of subtle genotype-dependent gene expression may impact microglia functions and ultimately disease risk over the course of a lifetime.
The goal of our study was to identify PD-associated SNPs that potentially function via allele-dependent regulation of gene expression in microglia. We first mapped open chromatin in induced pluripotent stem cell (iPSC)-derived microglia using assay for transposase accessible chromatin with sequencing (ATAC-seq). We combined our data with published ATAC-seq data from primary microglia 11 to create a consensus list of open chromatin regions. We then overlapped these consensus regions with published H3K27ac chromatin immunoprecipitation sequencing (ChIP-seq) datasets from primary ex vivo microglia tissue 11 to demarcate active enhancers and promoters. These regions overlapped 73 out of 6,749 "SNPs of interest" published in the latest GWAS meta-analysis. 1 We report these as candidates for in-depth mechanistic evaluation in microglia. 12
For thorough functional analysis, we chose to focus on one of our top candidate regulatory elements, an intragenic "risk enhancer" at SNCA, defined by its overlap with a PD risk SNP. GWASs of PD have uncovered many risk variants at the SNCA locus which make up at least three independent association signals. 13,14 However, distinct functional variants at this locus have yet to be identified in microglia. In this study we report on two variants, rs2737004 and rs2619356, that we believe to be functional in microglia. They are in linkage disequilibrium (LD) with a PD-association signal spanning the 5′ end of SNCA into MMRN1. CRISPR/Cas9 deletion of the open chromatin region containing these variants led to reduced expression of SNCA and the adjacent gene MMRN1, confirming a regulatory effect on nearby genes. In addition, there was a small subset of differentially expressed genes involved in cell-cycle regulation and glucose metabolism, which are two linked processes
interactions happen between the portion of SNCA that contains the risk enhancer and active regulatory DNA at MMRN1 and GPRIN3. The interaction profile is different in neurons, where some of the contact points are present within the H3K27ac signal. However, the interactions appear to be more localized around SNCA and the intergenic region between SNCA and GPRIN3. In oligodendrocytes there are few 3D interactions, all of which are between SNCA and GPRIN3 (data not shown). This suggests that the SNCA risk enhancer functions differently in microglia than in neurons or oligodendrocytes and highlights the cell-specific activity of the risk enhancer.
Both SNPs of the "risk haplotype" could have a synergistic influence on gene expression levels of SNCA or other genes via allele-specific binding to transcription factors. As a first approximation for such a mechanism, we used MotifbreakR 18 to find transcription factors that are predicted to have differences in binding strength to the protective and risk alleles of rs2737004 and rs2619356. The most frequent motif altered by rs2737004 genotype was CTCF, which shows a preference for G (the risk allele) (Figure 2B; Table S2). RAD21, a core subunit of the cohesin complex, also shows a preference for the G allele (Figure 2B; Table S2). To provide supporting evidence of CTCF and RAD21 binding at rs2737004, we searched ChIP-seq data from ReMap. 21 Although microglia are not a cell type in the ReMap database, multiple other cell types, including peripheral macrophages (a related cell type), show binding of CTCF at rs2737004 (Figure 2C orange bars). There is also evidence of RAD21 binding at the same location (Figure 2C blue bars). CTCF and RAD21 work together to mediate 3D chromatin structure, forming loops between enhancers and promoters to facilitate chromatin accessibility and gene expression. 22,23 A plausible mechanism leading to increased risk for PD could thus involve changes in 3D chromatin organization that impact the expression of multiple genes.
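As a rough illustration of what an allele-specific motif comparison involves, the sketch below scores a short sequence window against a position weight matrix once per allele and reports the difference. The count matrix, the four-base window, and the function names are hypothetical placeholders invented for this example; they are not taken from MotifbreakR, from any motif database, or from the actual sequence around rs2737004, and the scoring is a generic log-odds sum rather than MotifbreakR's own method.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
# The counts are invented for illustration only.
TOY_COUNTS = [
    {"A": 2, "C": 10, "G": 6, "T": 2},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 12, "C": 2, "G": 4, "T": 2},
    {"A": 1, "C": 16, "G": 2, "T": 1},
]
BACKGROUND = 0.25  # uniform background base frequency (an assumption)


def log_odds_matrix(counts, pseudocount=1.0):
    """Convert per-position base counts into log2 odds versus background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((n + pseudocount) / total) / BACKGROUND)
            for base, n in column.items()
        })
    return matrix


def score_window(matrix, window):
    """Sum the log-odds scores of a window as long as the motif."""
    if len(window) != len(matrix):
        raise ValueError("window length must equal motif length")
    return sum(column[base] for column, base in zip(matrix, window))


def allele_difference(matrix, window, snp_index, allele_a, allele_b):
    """Return score(allele_b) - score(allele_a) with the SNP at snp_index."""
    seq_a = window[:snp_index] + allele_a + window[snp_index + 1:]
    seq_b = window[:snp_index] + allele_b + window[snp_index + 1:]
    return score_window(matrix, seq_b) - score_window(matrix, seq_a)


pwm = log_odds_matrix(TOY_COUNTS)
# Toy window with the variable base at index 1; A and G are the two alleles.
delta = allele_difference(pwm, "CAAC", snp_index=1, allele_a="A", allele_b="G")
print(f"score(G) - score(A) = {delta:.2f}")
```

A positive difference means the toy motif scores higher on the second allele, which is the kind of allele preference reported for CTCF and RAD21 above. Real tools additionally scan both strands and every window that covers the SNP.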
The MotifbreakR analyses also showed that rs2619356 influences the affinity for many transcription factors (Table S2). Of the transcription factors that have a strong preference for the T-allele (part of the risk haplotype), CEBP proteins, specifically CEBPB and CEBPE, appeared most frequently (3 motifs) (Figure 2B; Table S2). There was no ReMap data for CEBPE. However, CEBPB binds to a motif at rs2619356 in multiple cell types, including macrophages. CEBPB is a basic-leucine zipper transcription factor that regulates pro-inflammatory responses in microglia. 24 It has also been shown to bind at SNCA and promote its expression in a neuroblastoma cell line. 25 Others have linked SNCA dysregulation to allele-specific transcription factor occupancy. 3,26 It is hypothesized that this mechanism leads to subtle increases in SNCA expression over a long period of time, contributing to alpha-synuclein aggregation later in life. These data point to a similar mechanism. However, this analysis is a preliminary step in demonstrating the functionality of rs2619356 and rs2737004. Moving forward we focused on identifying the target genes of the risk enhancer using CRISPR/Cas9-mediated deletion of its open chromatin region.
The gene is part of the PARK16 locus (one of the first to be identified by PD GWAS). Its expression is downregulated in PD patients, but its causal mechanisms are unknown. 58
Table of top-ranking SNPs (based on GWAS p value) in open chromatin at active regulatory regions of DNA in microglia. See also Table S1 and Figure S2.
The SNCA risk enhancer controls expression of SNCA, MMRN1, and a network of additional genes in microglia
When defining risk genes, a common approach is to implicate the nearest genes to the SNP. Whereas this approach has provided substantial insight into PD-related pathways, it may not reveal the full extent of gene targets because an enhancer may affect genes multiple kilobases away on linear DNA or even on different chromosomes. 27 To determine the target genes of the enhancer in which the risk haplotype resides, we created a 439 bp deletion of the open chromatin region containing rs2737004 and rs2619356 (Figure 1B). We then performed bulk RNA sequencing (RNA-seq) at three time points across differentiation of microglia: day 0 iPSCs, day 12 hematopoietic progenitors (HPCs), and day 40 microglia. Sample names, their collection times, and total number of technical and biological replicates can be found in Figure S4B. There were no significant differences in expression of the standard microglia marker genes AIF1 (IBA1), TMEM119, CD11b, and P2RY12 in wild-type compared to edited cell lines (Figure S5C). Note that TMEM119 mRNA expression was low in microglia. The use of this marker as a robust indication of microglia has been recently challenged due to its variability, both increased and decreased, depending on activation status. 28 We also confirmed protein expression by immunocytochemistry (ICC) of TMEM119 and IBA1 in all cell lines (Figure S5B). We did not do a formal quantification due to inconsistencies related to timing and cell losses during the staining procedure. Looking across time points, we observed that the number of genes influenced by the enhancer deletion increased with the differentiation time course (Figure 3A and related Table S1). There was also little overlap between differentially expressed genes at each stage, suggesting that either the enhancer is not as active at earlier time points or it controls different sets of genes at each stage in differentiation. Interestingly, SNCA and MMRN1 expression significantly decreased in microglia but was unchanged in iPSCs and HPCs (Figure 3A), indicating a unique role for the risk enhancer in controlling expression of SNCA and MMRN1 in microglia. Our findings also corroborate the PLAC-seq data showing that MMRN1, in addition to SNCA, is a likely target of the risk enhancer in microglia. The PLAC-seq data indicated that GPRIN3 could be a target gene. However, its expression was not significantly different at any time point after the enhancer deletion.
Genes with a differential expression false discovery rate (FDR) of ≤0.1 in iPSCs and HPCs showed no enrichment of pathways identified by the GOnet gene ontology (GO) analysis tool. 29 Using the same cutoff in microglia, the only pathway to show enrichment was DNA conformation change (FDR = 1.2e-2), which is described by the Gene Ontology Resource as "a cellular process that results in a change in the spatial configuration of a DNA molecule. A conformational change can bend DNA or alter the twist, writhe, or linking number of a DNA molecule." 30 Seven out of the 42 genes, CENPK, DNA2, HELLS, DSCC1, CENPU, RAD54B, and MCM8, were a part of this network. The genes that had an FDR value ≤0.05 are displayed and underlined on the heatmap in Figure 3B. Looking more specifically, they are known to be involved in DNA replication and are likely indicative of changes in the cell cycle. This was confirmed using GO enrichment from the Gene Ontology Resource (FDR = 2.66E-03). 30 We also observed an inverse relationship between GYG2 (Glycogenin 2) and PFKP (Phosphofructokinase) expression (Figure 3B red arrows). GYG2, a gene involved in glycogen biosynthesis, was most significantly upregulated while PFKP, a gene involved with the conversion of fructose 6-phosphate to fructose 1,6-bisphosphate at the beginning stages of glycolysis, was downregulated. This, along with changes in cell-cycle genes, suggests that loss of the enhancer could be affecting glucose metabolism and possibly shifting glucose utilization to glycogen storage rather than glycolysis. This is a phenotype that we plan to explore in future experiments.
Figure 3 (caption, partial): ... S3 for results. (B) Heatmap of Z scores calculated from TMM-normalized log2 counts per million (log2 CPM). The genes in the heatmap represent those that have a Benjamini-Hochberg FDR of less than 0.1 (less stringent than used for the volcano plots in part A) and a fold change (FC) of over 1.4 (log FC > 0.5). The final plot was made by ranking the genes by log FC and taking the 15 genes at the top and bottom of that list.
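The gene selection and row scaling described for the Figure 3B heatmap can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout, the function names, and the toy values are hypothetical, the log FC cutoff is applied to the absolute value, and the Z scores use the sample standard deviation, which is an assumption.

```python
from statistics import mean, stdev


def select_heatmap_genes(results, fdr_cutoff=0.1, min_abs_logfc=0.5, n_each=15):
    """Filter by FDR and |log FC|, rank by log FC, keep top and bottom n_each.

    `results` is a list of dicts with 'gene', 'logFC', and 'FDR' keys
    (a hypothetical layout chosen for this sketch).
    """
    kept = [r for r in results
            if r["FDR"] < fdr_cutoff and abs(r["logFC"]) > min_abs_logfc]
    ranked = sorted(kept, key=lambda r: r["logFC"], reverse=True)
    if len(ranked) <= 2 * n_each:
        return [r["gene"] for r in ranked]
    return [r["gene"] for r in ranked[:n_each] + ranked[-n_each:]]


def row_z_scores(values):
    """Z-score one gene's log2 CPM values across samples."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Invented toy records; gene names and numbers are placeholders.
toy_results = [
    {"gene": "gene_up", "logFC": 2.1, "FDR": 0.001},
    {"gene": "gene_down", "logFC": -1.3, "FDR": 0.01},
    {"gene": "gene_small_change", "logFC": 0.2, "FDR": 0.01},
    {"gene": "gene_not_significant", "logFC": 1.0, "FDR": 0.5},
]
selected = select_heatmap_genes(toy_results, n_each=1)
z = row_z_scores([1.0, 2.0, 3.0])
print(selected, z)
```

Scaling each gene separately makes genes with very different baseline expression comparable on one color scale, which is why heatmaps of this kind show Z scores rather than raw log2 CPM.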

DISCUSSION
Here we aimed to determine a subset of PD-associated risk SNPs located in regions of active regulatory DNA in microglia and to identify functional risk SNPs in this cell type. In doing so, we substantially narrowed down the 6,749 PD-associated "SNPs of interest" from the latest PD meta-analysis to a more tractable list of 73. We chose one candidate risk locus, SNCA, for more in-depth evaluation based on its overlap with a top-ranking candidate SNP, rs2737004 (GWAS p value = 7.6 × 10^-11, OR = 1.14), in addition to the relevance of SNCA to PD risk, which is not well understood in microglia.
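The narrowing step summarized above (open chromatin regions that coincide with H3K27ac signal, then SNPs that fall inside those regions) can be sketched with plain interval logic. This is a simplified illustration under stated assumptions: the coordinates and SNP names are invented, intervals are treated as closed, and the function names are hypothetical. It does not reproduce the tools or parameters used in the study.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def active_regions(open_chromatin, h3k27ac):
    """Keep open chromatin regions that overlap any H3K27ac-marked region."""
    return [r for r in open_chromatin if any(overlaps(r, p) for p in h3k27ac)]


def snps_in_regions(snps, regions):
    """Return names of SNPs whose position lies inside any region."""
    return [name for name, chrom, pos in snps
            if any(chrom == c and start <= pos <= end
                   for c, start, end in regions)]


# Invented toy coordinates for illustration only.
open_chromatin = [("chr4", 1000, 2000), ("chr4", 4800, 5200)]
h3k27ac = [("chr4", 1800, 2500)]
snps = [("snp_a", "chr4", 1500), ("snp_b", "chr4", 5000), ("snp_c", "chr1", 1500)]

regulatory = active_regions(open_chromatin, h3k27ac)
candidates = snps_in_regions(snps, regulatory)
print(candidates)
```

In this toy case only the first SNP survives: the second sits in open chromatin without H3K27ac, and the third is on another chromosome. For genome-scale peak sets, a sorted or tree-based interval index would replace the nested loops.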
Multiple PD GWASs have reported independent association signals at the SNCA locus, but it is still unclear which variants are functional around these signals. 1,14,31 The top-ranking GWAS hit, rs356182 (p value = 1.85 × 10^-82), is located at the 3′ end of SNCA in a regulatory element identified in neurons. 26 Our lab previously demonstrated that heterozygous deletion of rs356182 causes changes in the expression of SNCA and thousands of additional genes, many of which are related to neuronal differentiation. 26 However, that same 3′ enhancer is not active in microglia and none of the risk variants near rs356182 overlap microglia H3K27ac and ATAC signals. Alternatively, our analysis points to a potential functional SNP in microglia, rs2737004, that appears to segregate independently of rs356182 (D' = 0.6786, R² = 0.0816). Rs2737004 is linked to a separate independent GWAS signal, near the 5′ end of SNCA. 13,14 It is in LD with the top tag SNP rs763443 (D' = 0.9484). We speculate that the risk signal tagged by rs763443 represents a group of functional variants in microglia (or other immune cells), and our analysis pinpoints rs2737004 as a top functional candidate. In contrast, the risk signal at rs356182 may represent functional variants in other cells such as neurons. In the larger context of PD, we imagine a scenario where multiple nearby genetic risk signals each function through SNCA, but these separate signals represent biology relevant to different cell types. Thus, each set of regulated genes is unique, resulting in dysregulation of different cellular processes. For these reasons, we believe that, even though rs356182 carries the strongest association with PD risk, it is not the most relevant candidate for follow-up in microglia and more focus should be placed on understanding the function of PD-associated variation at the independent signal at the 5′ end of SNCA.
Upon further evaluation of the SNCA locus, we located an additional SNP, rs2619356, in the risk enhancer that is in strong LD (D' = 1) with rs2737004. Interestingly, one version of the haplotype is non-existent in the study population and the risk allele of rs2737004 only appears with the T allele of rs2619356, indicating co-segregation/linkage of the two alleles. Rs2619356 was not published in the original list of "SNPs of interest" likely due to the use of R² as a measure of LD, which results in the exclusion of SNPs with allele frequencies that are not close to 50%. In terms of our approach to include functional SNPs based on their location in open chromatin at active regulatory DNA, rs2619356 is positioned in the center of the open chromatin region and transcription factor binding ReMap signal (Figure 2C). It is also predicted to have allele-specific preference for multiple transcription factors (Table S2), making it an equally relevant candidate for mechanistic follow-up.
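The difference between the two LD measures mentioned above can be made concrete with the standard definitions of D, D', and r². The sketch below computes both from one haplotype frequency and two allele frequencies. The frequencies in the example are invented for illustration and are not the population frequencies of rs2737004 or rs2619356; the function name is hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and
          allele B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Invented example: allele A (frequency 0.1) is only ever seen together with
# allele B (frequency 0.4), so the A/non-B haplotype is absent.
d_prime, r2 = ld_stats(p_ab=0.1, p_a=0.1, p_b=0.4)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.3f}")
```

When one haplotype is missing, D' reaches 1 even though r² stays low because the two allele frequencies differ. This is the situation in which an r²-based filter can drop a SNP that is nonetheless fully co-segregating with a risk allele.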
A plausible mechanism of the risk enhancer could involve alterations in chromatin looping mediated by allele-specific CTCF and RAD21 binding. The G allele of rs2737004 has a stronger preference for CTCF and RAD21, which could promote the stability of the loop to facilitate SNCA and MMRN1 expression. In 3 out of the 4 tissues profiled (breast, esophagus, and pituitary) by the Genotype-Tissue Expression project (GTEx), the GG genotype of rs2737004 is associated with increased expression of SNCA. The presence of CEBP within the motif for rs2619356 and its preference for the allele of the risk haplotype supports this idea, as CEBP has also been shown to promote the expression of SNCA. Studies, primarily in neurons, demonstrate that elevated expression of SNCA leads to alpha-synuclein aggregation and cellular dysfunction. However, the function of endogenous SNCA in microglia is not well known. SNCA has multiple roles that could require time-specific or varying levels of SNCA expression in different cell types. For example, alpha-synuclein is highly expressed in neurons and required for many important neuronal functions such as differentiation and synaptic transmission. 32 Alpha-synuclein has more recently been shown to facilitate immune responses and thus may be lowly expressed under homeostatic conditions in immune cells like microglia but become upregulated on a temporal basis in response to stress or infection. 33,34 In different cellular contexts, SNCA may therefore be regulated differently via cell-type-specific enhancers. This could explain the differences we observed in topology and regulatory signals like H3K27ac at SNCA when comparing microglia to neurons and provides a hypothetical scenario for how PD-associated variants within regulatory elements increase disease risk in distinct cell populations. However, this has not been directly demonstrated, and how SNCA regulation is carried out in different cell types and different contexts such as viral infection and stress is still an early area of research.
Enhancer looping to target promoters is one of the critical aspects of proper gene regulation. Although the resolution of PLAC-seq data is not precise enough to distinguish whether the interaction of MMRN1 is with the risk enhancer, the promoter of SNCA, or both, there does appear to be evidence for loop formation between SNCA and MMRN1 (Figure 2A). This points to MMRN1 as a potential target gene. We further believe that MMRN1 is a plausible target, as deletion of the enhancer led to loss of MMRN1 expression in addition to reduced SNCA expression. Whether the expression changes in the remaining set of genes are due to alterations in primary enhancer interactions or secondary downstream effects is still an open question.
SNCA expression changes, both up and down, have been associated with PD risk, but fewer studies have focused on how MMRN1 relates to PD risk. MMRN1 is mostly known to function in platelets to aid in coagulation, but there is limited understanding of its physiological functions. 35 Although the gene has been mostly studied in the context of cancer metastasis, a transcriptome-wide association analysis recently identified MMRN1 as a gene whose expression associates with PD risk. 36 MMRN1 genetic abnormalities have also been found in autosomal dominant PD. 37 Surprisingly, SNCA and MMRN1 expression was not significantly different in iPSCs or HPCs, whereas in microglia, deletion of the enhancer led to a loss in their expression. We hypothesize that the enhancer promotes or maintains SNCA and MMRN1 expression as the cells differentiate into microglia. This suggests a biologically important role for both SNCA and MMRN1 in microglia, but more studies are needed to corroborate this hypothesis and to understand their function in PD.
To evaluate whether our differentially expressed gene set contained any genes that are known to be dysregulated in PD patients, we looked for overlap with published gene sets from cells with genetic abnormalities in SNCA. 38 We confirmed overlap of 5 genes in our dataset with a gene set from A53T SNCA mutant dopaminergic neurons. In addition to SNCA, MMRN1, DTL, and DPPA4 were upregulated whereas PFKP was downregulated in comparison to wild-type control cells. In a PD patient-derived dopaminergic cell line with an SNCA triplication, SNCA was upregulated, but VWCE, EDA2R, PUS7L, and MGAM2 were downregulated. This suggests that changes in SNCA may affect a common set of genes in dopaminergic neurons and microglia. However, the changes we observed in our model may not be completely related to SNCA expression changes but more related to the loss of the enhancer that controls a specific set of genes, including SNCA, that are relevant to microglia function.
Two other genes stood out due to their involvement in glucose metabolism, which is a key process in microglia that controls their activation in response to inflammation. 39 The most upregulated gene, GYG2, is involved in glycogen biosynthesis. We also observed a loss of PFKP, which is a critical regulatory enzyme in glycolysis. 40 Our GO analysis identifying cell cycle as a process affected by 7 of the other differentially expressed genes supports a role for this metabolic phenotype in PD risk as metabolism and cell cycle are highly interconnected. 41 On a related note, out of the 73 SNPs in microglia regulatory DNA, the one with the most significant GWAS p value is at SLC50A1 (Table 1; Figure S2), a glucose transporter whose function has never been studied in microglia. There have been clinical findings showing that PD patients have reduced glucose metabolism at early stages of the disease. 42 Strikingly, altered glucose metabolism in diabetes patients was also found to increase the chances for developing PD by 30%. 43 Our results and the findings of others justify the need for more studies to understand how dysfunctional glucose metabolism in microglia leads to increased risk for PD as this pathway may be a promising therapeutic target in PD.
To the best of our knowledge this is the first functional evaluation of a "risk enhancer" near the PD-association signal at the 5′ end of SNCA. How SNCA, MMRN1, and other genes are regulated by this enhancer may play an important part in PD pathogenesis by impacting inflammatory functions in microglia. Our data also provide a starting point for dissecting genetic risk at other loci and demonstrate the importance of careful evaluation of PD-associated variants on a cell-type-specific basis. We advocate for more post-GWAS testing of these risk variants to make sense of the genetic contribution to increased PD risk. There are currently no treatments to modify the progression of PD. Additional studies that build on our findings will help understand the complex genetic etiology of PD and identify alternative disease-modifying targets.

Limitations of the study
Although we demonstrated a role for the risk enhancer in controlling a network of genes, the mechanisms of the SNPs within enhancers remain to be determined. We attempted to create single-base-pair edits to generate isogenic cell lines with different genotypes of rs2737004. However, due to evidence of confounding effects on gene expression (possibly off-target edits), we did not move forward with that portion of the study. This limitation is worth noting because of the technical constraints of single-base-pair editing, which requires a PAM sequence near the targeted nucleotide. The choice of guide sequences is therefore much more restricted, which increases the chances for off-target edits. Future follow-up studies using cell lines with germline variation, similar to the methods of Langston et al., 44 may be a better approach.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

EXPERIMENTAL MODELS AND STUDY PARTICIPANT DETAILS

iPSC cell lines
For ATAC-seq experiments, induced pluripotent stem cells (iPSCs) were obtained from ATCC (ACS-1019, DYS0100, male neonate). For CRISPR editing experiments, iPSCs were obtained from Synthego (PGP1-SV1, male, age 55). All validation and QC of cell lines were performed by the supplier. We have not authenticated these cells following receipt. Both cell lines were cultured at 37 °C with 5% CO2. At the iPSC stage, cells were cultured in StemFlex medium on either Geltrex LDEV-free reduced growth factor basement membrane for the ATCC cell line or iMatrix for the PGP1 cell line. When cells reached 80% confluency, they were passaged using ReLeSR

METHODS DETAILS

iPSC differentiation to HPCs
iPSCs were first differentiated to hematopoietic progenitors (microglia precursors) using the STEMdiff hematopoietic kit per the methods detailed by McQuade et al. 15 On day -1, iPSCs were seeded in 6-well plates and allowed to adhere overnight. On day 0, StemFlex was removed and replaced with medium A. On day 2, half of the medium was replaced with fresh medium A. On day 3, all media was changed to medium B. Half medium changes were then done on days 5, 7, and 10. Cells were harvested on day 12 and assessed for CD43 expression by flow cytometry.

Flow analysis of HPC markers
To confirm iPSC differentiation to HPCs on day 12, cells were stained for CD43. We followed the "Cell Surface Flow Cytometry Staining Protocol" from Biolegend. Supernatant and non-adherent cells were collected from each well of a 6-well plate. They were then pelleted and resuspended in 5 mL for counting. Single suspensions of 200,000 cells were prepared in up to 15 mL of Cell Staining Buffer. Cells were centrifuged at 350 x g for 5 minutes and supernatant was discarded. The pellet was then resuspended in Cell Staining Buffer (100 µL per condition). Five microliters of TruStain FcX (Fc Receptor Blocking Solution) was added to each sample followed by an incubation at room temperature for 5-10 minutes. After incubation, 200,000 cells per condition were aliquoted into culture test tubes (Fisherbrand 12 × 75 mm), one for CD43 and one for the non-stained control. Five microliters of CD43 was then added to one sample and allowed to incubate for 15-20 min in the dark. Cells were washed two times with at least 3 mL of Cell Staining Buffer. After the final wash, the pellet was resuspended in 300 µL of Cell Staining Buffer plus DAPI (10.9 mM stock) at a final concentration of 3 µM. Samples were analyzed on a Beckman Coulter CytoFLEX S flow cytometer. We considered cultures pure if over 90% of cells assayed were positive for CD43 (per McQuade et al. 15). See Figures S3B and S5A for flow results on the ATCC and PGP1 cell lines, respectively.
The pellet was resuspended in 1 mL/well fresh medium plus 3 cytokines and added back to the same plate. Media was supplemented again with 1 mL/well fresh microglia media plus 3 cytokines on days 14, 16, 18, 20, 22, and 24. On day 25 all but 1 mL was removed from each well and spun down at 300 rcf for 5 minutes. Cells were then resuspended in 1 mL/well fresh microglia medium plus 5 cytokines (100 ng/mL IL-34, 50 ng/mL TGFb, 25 ng/mL M-CSF, 100 ng/mL CD200 and 100 ng/mL CX3CL1). Cells were collected on day 28 for ICC and RNA-seq.

Immunocytochemistry and imaging microglia
Following differentiation of HPCs to microglia, expression of microglia-specific markers was confirmed using a mouse anti-human Iba1 primary antibody with Alexa Fluor 488 goat anti-mouse secondary antibody. We also used a rabbit anti-human TMEM119 primary antibody with Alexa Fluor 594 goat anti-rabbit secondary antibody. The staining procedure differed for each cell line. For the ATCC cell line, cells were plated in 24-well plates on glass coverslips coated with PEI and allowed to adhere for 24 hours in the incubator (5% CO2 at 37 °C). They were then fixed with 4% paraformaldehyde and permeabilized (DPBS, goat serum, and Triton X). Cells were incubated in blocking buffer (DPBS and goat serum) with primary antibodies overnight at 4 °C. After 24 hours, cells were washed with DPBS and incubated with secondary antibodies in DPBS for 30 minutes. They were then washed again prior to adding NucBlue Fixed stain ready probes. Coverslips were removed and mounted on glass slides using EverBright Hardset Mounting Medium. For PGP1 cell lines, staining was done the same as for the ATCC cell lines, except they were plated on Poly-D-Lysine in Ibidi 8-well chamber slides and allowed to adhere at 37 °C for 2 hours prior to fixing. For each cell line imaging was done as follows.
ATCC: Fluorescent images for Figure S5 A were taken with the Nikon Eclipse microscope with NIS Elements software (version 5.11.01) and exported as TIFFs. Bright field images were taken directly in culture plates using the EVOS microscope. The final panel of images was compiled in PowerPoint.
PGP1 wild-type and edited: Confocal Z-stacks for Figure S5 B were collected using a Zeiss LSM 880 equipped with an Axio Observer 7 inverted microscope body and acquired with Zen Black (version 2.3) software using 405 nm diode, 488 nm argon ion, and 561 nm DPSS laser lines. Emitted light was detected through a Zeiss Plan-apochromat 63×/1.4 NA oil immersion objective, using an Airyscan GaAsP detector. Images were collected sequentially in 1024 × 1024 pixel resolution, using 0.25 µm z-steps. Images were acquired with an optical zoom of either 1.0 or 2.0, and individual voxels were therefore 0.13 × 0.13 × 0.25 µm or 0.07 × 0.07 × 0.25 µm (xyz), respectively.
To create Figure S5B, raw czi images were opened in Fiji ImageJ (v1.54f). An average intensity z-projection was generated for each image, inclusive of all channels and z, and saved as a tiff. Any image larger than 1012 × 1012 pixels was cropped for uniformity. All six average intensity projection images were then concatenated for easier import into the ImageJ plugin QuickFigures 59 to assemble into the final figure shown.

ATAC-seq
The biostatistics core at VARI conducted a power calculation based on ATAC-seq effect size to determine the optimal number of replicates. With three replicates and an average depth of ≥40 reads per million, this study has >80% power to detect, with 95% confidence, peaks with a ≥1.8-fold difference in accessibility and ≥99% power for a ≥2-fold difference in accessibility. This calculation was done for the purpose of performing a differential accessibility analysis between wild-type and edited microglia in future experiments. We believe the 4 replicates that we generated in combination with the 13 published ATAC-seq datasets are sufficient to detect robust peaks in microglia.
Microglia were thawed and cultured for at least one week prior to an ATAC-seq experiment. Samples that yielded the best fragmentation started from a total of 10K, 31K, and 100K cells. The pre-specified number of cells were aliquoted into 1.5 mL tubes and centrifuged at 400 x g for 7 minutes. The supernatant was removed, and the cells were washed once with 50 µL ice-cold PBS. The cells were then resuspended in ice-cold Lysis Buffer containing resuspension buffer (1 M Tris-HCl pH 7.5 (final conc. = 10 mM), 5 M NaCl (final conc. = 10 mM), 1 M MgCl2 (final conc. = 3 mM), and nuclease-free H2O), 10% NP-40 (final conc. = 0.1% v/v), 10% Tween-20 (final conc. = 0.1% v/v), and 1% Digitonin (final conc. = 0.01% v/v). Cells were then incubated on ice for 3 minutes. One mL of wash buffer (990 µL resuspension buffer + 10 µL Tween-20 (final conc. = 0.01% v/v)) was added to each tube. The tubes were then gently inverted 3 times and centrifuged at 500 x g for 5 minutes. For each sample, 10 µL of transposition mix (7.5 µL 2X TD Buffer, 2.05 µL 1X PBS, 0.15 µL 10% Tween-20 (final conc. = 0.1% v/v), 1% Digitonin (final conc. = 0.01% v/v), and 0.15 µL nuclease-free H2O) was added. Five µL of ATM was then added separately to each sample. The samples were incubated for 60 minutes on a thermomixer at 1,000 rpm. Following incubation, the samples were placed on ice, and 5 µL of NT buffer was added to each tube to neutralize the tagmentation reaction. Tubes were then centrifuged at 300 x g at 20 °C for 1 minute and incubated at room temperature for 5 minutes. DNA purification was done using the Zymo clean and concentrator kit.
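The stock and final concentrations listed for the resuspension buffer imply stock volumes through the usual dilution relation C1·V1 = C2·V2. The sketch below works these out for a batch of a chosen size. The 1,000 µL batch volume in the example and the function names are assumptions made for illustration; the text does not state how much buffer was prepared.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Solve C1 * V1 = C2 * V2 for V1 (concentrations in the same unit)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume_ul / stock_conc


def resuspension_buffer_recipe(final_volume_ul):
    """Volumes (in µL) of each stock for the resuspension buffer in the text."""
    # name: (stock concentration in mM, final concentration in mM)
    components = {
        "1 M Tris-HCl pH 7.5": (1000.0, 10.0),
        "5 M NaCl": (5000.0, 10.0),
        "1 M MgCl2": (1000.0, 3.0),
    }
    recipe = {
        name: stock_volume_ul(stock, final, final_volume_ul)
        for name, (stock, final) in components.items()
    }
    # Water makes up the remainder of the batch.
    recipe["nuclease-free H2O"] = final_volume_ul - sum(recipe.values())
    return recipe


# Assumed batch size of 1,000 µL, chosen only to make the numbers easy to read.
recipe = resuspension_buffer_recipe(1000.0)
for name, volume in recipe.items():
    print(f"{name}: {volume:.1f} µL")
```

The same helper applies to the detergent additions, since a percentage (v/v) stock and final value can be passed in place of millimolar concentrations as long as both use the same unit.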
For library generation, 5 µl of Illumina Nextera DNA Unique Dual Indexes plus 25 µl NEBNext High-Fidelity 2X PCR Master Mix were added to 20 µl of purified transposed DNA. The transposed fragments were amplified starting at 72 °C for 5 minutes and 98 °C for 30 seconds, followed by five cycles of 98 °C for 10 seconds, 63 °C for 30 seconds, and 72 °C for 1 minute. qPCR was used to determine how many additional cycles to run on each sample. The PCR mix was composed of 5 µl of the partially amplified library from the previous step, 0.5 µl Illumina primer 1 (25 µM), 0.5 µl Illumina primer 2 (25 µM), 0.75 µl 20X Eva Green, and 5 µl NEBNext High-Fidelity 2X PCR Master Mix. Cycle conditions were set to 98 °C for 30 seconds, followed by 20 cycles of 98 °C for 10 seconds, 63 °C for 30 seconds, and 72 °C for 1 minute. The R vs. cycle number was plotted on a linear scale. Additional cycles were calculated by determining the number of cycles needed to reach 1/3 of the maximum R. PCR was continued on the remaining partially amplified libraries for the number of cycles calculated in the previous step.
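The one-third-of-maximum rule for choosing additional cycles can be sketched as follows. This is a minimal illustration, assuming the fluorescence trace is baseline-corrected and that the cycle count is rounded up to the first whole cycle at or above the threshold; the function name and the example trace are hypothetical and do not come from the study.

```python
def additional_cycles(fluorescence):
    """Return the first qPCR cycle (1-based) whose signal reaches 1/3 of the maximum.

    `fluorescence` holds one baseline-corrected reading per cycle.
    """
    if not fluorescence:
        raise ValueError("fluorescence trace is empty")
    threshold = max(fluorescence) / 3.0
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    # Only reachable for traces whose maximum is negative.
    raise ValueError("trace never reaches one third of its maximum")


if __name__ == "__main__":
    # Hypothetical 20-cycle trace, invented purely to exercise the function.
    trace = [0.05, 0.06, 0.08, 0.12, 0.2, 0.35, 0.6, 1.0, 1.6, 2.4,
             3.2, 3.9, 4.4, 4.8, 5.1, 5.3, 5.4, 5.5, 5.55, 5.6]
    print(additional_cycles(trace))
```

Interpolating between cycles and then rounding is an equally plausible reading of the rule; the whole-cycle version is used here only because it is the simplest to state.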

Sequencing of ATAC-seq libraries
Library quantification, size selection, and sequencing were carried out by the Genomics Core at VAI. PCR-amplified libraries were size selected for fragments 200-800 bp in length using double-sided SPRI selection (0.5x followed by 1x) with KAPA Pure beads. The quality and quantity of the finished libraries were assessed using a combination of Agilent DNA High Sensitivity chip (Agilent Technologies, Inc.) and QuantiFluor® dsDNA System. Seventy-five base pair, paired-end sequencing was performed on an Illumina NextSeq 500 sequencer using a 150 bp sequencing kit (v2) to produce a minimum of 50M paired reads per library. Base calling was done by Illumina NextSeq Control Software (NCS) v2.0, and the output of NCS was demultiplexed and converted to FastQ format with Illumina Bcl2fastq v1.9.0.
that had a quality score below 20 and were less than 20 base pairs in length were removed using Trimgalore (see key resources table for citation). Reads were then aligned with STAR 70 and quality checked using MultiQC. 61

Sequencing of total RNA-seq libraries
Libraries were prepared by the Van Andel Genomics Core from 10 ng of total RNA using Takara SMARTer Stranded Total RNA-Seq Kit v3 Pico Input Mammalian per the manufacturer's protocol. In brief, RNA was sheared to 300-400 bp, after which dscDNA was generated using a template switching mechanism, and unique dual indexed adapters were added to each sample. Ribosomal cDNA was degraded by scZapR and scrRNA probes, and libraries were amplified with 13 cycles of PCR. Quality and quantity of the finished libraries were assessed using a combination of Agilent DNA High Sensitivity chip, QuantiFluor® dsDNA System, and Kapa Illumina Library Quantification qPCR assays. Individually indexed libraries were pooled, and 100 bp, paired-end sequencing was performed on an Illumina NovaSeq6000 sequencer to return a minimum read depth of 50M read pairs per library. Base calling was done by Illumina RTA3, and the output of NCS was demultiplexed and converted to FastQ format with Illumina Bcl2fastq v1.9.0.

QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical analysis for differential gene expression in microglia was done using edgeR version 3.18. 71 Comparisons were made by grouping all technical and biological replicates of the edited microglia (A11 and H11, n = 4) and all replicates of the wild-type microglia (WT, n = 3), with a batch correction to account for separate differentiations. For a detailed account of biological and technical replicates, see Figure S4B. Using edgeR, we first performed TMM normalization of libraries. Significant differences were then determined using the genewise negative binomial generalized linear model (glmQLFit). Expression differences were considered statistically significant if the FDR value was ≤ 0.05. EdgeR estimates dispersion from replicates using the quantile-adjusted conditional maximum likelihood method (qCML). Details of the statistical analysis can be found in the Figure 3 legend. The individual samples compared can be found in Figure 3B.
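The FDR threshold above refers to Benjamini-Hochberg adjusted p values, the procedure named in the Figure 3 legend. The following is a minimal, standard-library sketch of that adjustment, written only to illustrate how the FDR ≤ 0.05 call is made; it is not the edgeR implementation, and the function names are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(p_values, fdr_cutoff=0.05):
    """Flag each test whose adjusted p value is at or below the cutoff."""
    return [q <= fdr_cutoff for q in benjamini_hochberg(p_values)]
```

Calling `significant(p_values)` on the per-gene p values from a differential expression test returns one boolean per gene, in the same order as the input.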

Figure 1. A "risk haplotype" resides in an intragenic SNCA enhancer (A) Correlation analysis of the alleles of rs2737004 and rs2619356 using NCBI's LDlink tools. The LDhap results show the frequency of each allele of the SNPs individually in all populations from NCBI. The colored boxes represent the haplotypes observed in the same population, with their counts and frequencies displayed below. The LDpair analysis reports the calculated statistics for linkage disequilibrium (D′ and R²) and the "goodness-of-fit" (chi-square and p value), which indicates the degree to which the observed haplotype frequencies deviate from the expected allele frequencies. (B) Genome browser view of microglia ATAC-seq and H3K27ac ChIP-seq signals plotted with the locations of PD risk SNPs, all SNPs from 1000 Genomes, and the SNPs that are in LD with rs2737004. Rs2619356 was the only other SNP to overlap the ATAC-seq peak. Dotted lines represent the location of CRISPR/Cas9 guides designed to delete the 439 base pair region encompassing both SNPs in the "risk haplotype."

Figure 2. The SNCA risk enhancer shows evidence of functionality via 3D chromatin interactions and transcription factor binding (A) H3K27ac ChIP-seq tracks are displayed for microglia (purple) and neurons (green). Below are tracks showing PLAC-seq data for the same cell types. The red ovals denote primary interaction sites, in microglia, of the risk enhancer where rs2737004 is located. (B) MotifbreakR results showing transcription factor (TF) binding motifs of the TFs that have a preference for the alleles of the risk haplotype (G for rs2737004 (left) and T for rs2619356 (right)). The letter size represents the results of the positional weight matrix that measures the frequency with which the transcription factor binds to that nucleotide. In that same plot, the dashed black box demarcates the position of the SNP. The light blue boxes below represent the positions of the transcription factor binding motifs relative to the SNP's genomic position, demarcated with the red box. (C) Remap ChIP-seq data for the transcription factors displayed in part B. The red lines show the position of each SNP within the ChIP-seq peak.

Figure 3. The SNCA risk enhancer controls the expression of SNCA, MMRN1, and a network of additional genes in microglia (A) EdgeR glmFit results comparing wild-type to SNCA-deletion cell lines across three time points in differentiation. Dotted lines represent a Log2 fold change of 2 and a false discovery rate (FDR (Benjamini-Hochberg)) of 0.05. Gene name labels were manually added for consistency and clarity. See also Table S3 for results. (B) Heatmap of Z scores calculated from TMM-normalized log2 counts per million (log2 CPM). The genes in the heatmap represent those that have a Benjamini-Hochberg FDR cutoff of less than 0.1 (less stringent than used for the volcano plots in part A) and a fold change (FC) of over 1.4 (log FC > 0.5). The final plot was made by ranking the genes by log FC and taking the 15 genes at the top and bottom of that list.
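The gene selection and scaling described for panel B can be sketched as below. This is an illustrative reading of the legend, not the authors' script: it assumes the fold-change filter applies to the absolute log2 FC, that Z scores are computed per gene across samples with the sample standard deviation, and that results arrive as (gene, log2 FC, FDR) tuples. All names are hypothetical.

```python
import statistics


def row_z_scores(log2_cpm):
    """Z-score one gene's log2 CPM values across samples."""
    mean = statistics.fmean(log2_cpm)
    sd = statistics.stdev(log2_cpm)  # sample standard deviation (assumption)
    if sd == 0:
        return [0.0] * len(log2_cpm)
    return [(value - mean) / sd for value in log2_cpm]


def select_heatmap_genes(results, fdr_cutoff=0.1, min_abs_logfc=0.5, n_each=15):
    """Pick the genes at the top and bottom of the log FC ranking.

    `results` is a list of (gene, log2_fc, fdr) tuples.
    """
    passing = [r for r in results
               if r[2] < fdr_cutoff and abs(r[1]) > min_abs_logfc]
    ranked = sorted(passing, key=lambda r: r[1], reverse=True)
    if len(ranked) <= 2 * n_each:
        # Too few genes to split; keep them all, still ranked by log FC.
        return [gene for gene, _, _ in ranked]
    return [gene for gene, _, _ in ranked[:n_each] + ranked[-n_each:]]
```

With the default arguments this returns at most 30 genes, ordered from the most up-regulated to the most down-regulated.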

Table 1. Top candidate risk SNPs that overlap active regulatory DNA

TABLE
• RESOURCE AVAILABILITY
  ○ Lead contact
  ○ Materials availability
  ○ Data and code availability
• EXPERIMENTAL MODELS AND STUDY PARTICIPANT DETAILS
  ○ iPSC cell lines
• METHODS DETAILS
  ○ iPSC differentiation to HPCs
  ○ Flow analysis of HPC markers
  ○ Differentiation of CD43+ HPCs to microglia
  ○ Immunocytochemistry and imaging microglia
  ○ ATAC-seq
  ○ Sequencing of ATAC-seq libraries
  ○ Identification of ATAC-seq peaks
  ○ PD risk SNPs ATAC-seq peak intersect