Integrative genomics of microglia implicates DLG4 (PSD95) in the white matter development of preterm infants

Preterm birth places infants in an adverse environment that leads to abnormal brain development and cerebral injury through a poorly understood mechanism known to involve neuroinflammation. In this study, we integrate human and mouse molecular and neuroimaging data to investigate the role of microglia in preterm white matter damage. Using a mouse model where encephalopathy of prematurity is induced by systemic interleukin-1β administration, we undertake gene network analysis of the microglial transcriptomic response to injury, extend this by analysis of protein-protein interactions, transcription factors and human brain gene expression, and translate findings to living infants using imaging genomics. We show that DLG4 (PSD95) protein is synthesised by microglia in immature mouse and human, developmentally regulated, and modulated by inflammation; DLG4 is a hub protein in the microglial inflammatory response; and genetic variation in DLG4 is associated with structural differences in the preterm infant brain. DLG4 is thus apparently involved in brain development and impacts inter-individual susceptibility to injury after preterm birth.


DLG4/Lamp1
Supplementary Figure 14 DLG4 (PSD95) expressed by microglial cells is not localized in lysosomal compartments labelled by LAMP1. In tissue sections from P1 mouse, microglial cells expressing IBA1 (blue panels) from PBS or IL1B treated animals (P1), DLG4 (PSD95) immunoreactivity (red panels) is predominantly located at the surface of cell bodies and ramifications. Conversely, LAMP1 immunofluorescence (green panels) is mainly confined to intracellular vesicles. No colocalisation is thus observed between DLG4 and LAMP1 (DLG4/LAMP-1 panels). Scale bar 5 µm.  Expression of DLG4 gene from adult human brain white matter whole tissue (UKBEC repository).

Intersec(ons between SPN1-2 and neuropsychiatry disease modules
gene set in test gene set not in test

Gene Name EntrezGene Ensembl
where N is the number of edges between the neighbours of n, and M is the maximum number of edges that could possibly exist between the neighbours of n.
Characteristic path length: expected distance between two connected nodes.
Average number of neighbours: average connectivity of a node in the network (the size of its neighbourhood)

No. of nodes: number of genes in the network
Node degree exponent: Scale-free networks are characterized by a power-law degree distribution; the probability that a node has k links follows P(k) k -, where is the degree exponent.
Power law R squared: the proportion of variability in a data set, which is explained by a fitted linear model. Therefore, the R-squared value is computed on logarithmized data, where the power-law curve: y = β x α is transformed into linear model: ln y = ln β + α ln x

Animal model
Experimental protocols were approved by the institutional guidelines of the Institut National

Fluorescence activated cell sorting in mouse
Dissociated cells from the cerebrum of mice pups (P1, P3, P5, P10) were centrifuged on a Percoll gradient, as previously described 5 . This protocol uses a digestion cocktail containing collagenase and dispase, and it involves separation over discontinuous percoll gradients.
Cells were stained using different markers of myeloid cells. Neutrophils were defined as CD11B hi LY6G hi whereas monocytes, macrophages and microglia are defined as

CD11B+ microglia magnetic activated cell sorting in mouse
Brains were collected from mice for cell dissociation, and microglia were isolated by magnetic antibody-based cell sorting (MACS) using CD11B antibody according to the manufacturer's protocol using all recommended reagents and equipment (Miltenyi Biotec, Bergisch Gladbach, Germany) and as previously described 6 . In brief, mice were intracardially perfused with NaCl

Blood brain barrier analysis in rat
The integrity of the blood brain barrier (BBB) was assessed following exposure to either 36

hours (3 injections) or 5 days (9 injections) of twice-daily intraperitoneal injections of IL1B
(20µg/kg/injection) in the rat. The phenotype of the injury following IL1B mimics that observed in the mouse and was assessed as previously described, via gene expression analysis of the total brain and IgG staining of sections of the cerebral cortex [6][7][8] . Eight animals per group were used for the analysis. For gene expression analysis, pups were sacrificed by decapitation at P2 or P5, 5h after the last injection of IL1B. The brain was removed and the cerebrum was harvested and immediately frozen in liquid nitrogen. Total RNA was isolated using the Rneasy Mini Kit (Qiagen, Courtaboeuf, France) and qPCR performed as previously described 6 (Methods). Genes validated for use as indicators of BBB breakdown were studied 7 and primers for each gene measured are listed in Supplementary Table 13. Data are reported as relative to the reference gene, Gapdh. For the analysis of IgG staining, rat pups were killed by decapitation at P5, 5h after the last injection of IL1B. The brains were fixed immediately in 4% formalin and post-fixed for 5 days. Following processing and paraffin embedding, sections were prepared at 10µm and immunolabelling with the antibody anti rat IgG (Sigma, B7139) was performed using the streptavidin-biotin-peroxidase method, as previously described 8 .

Microarrays of mouse microglia gene expression and data pre processing
RNA was extracted and hybridised to Agilent Whole Mouse Genome Oligo Microarrays (8x60K). One biological replicate of IL1B exposure at P1 was removed from the analysis due to low RNA integrity, leaving a total number of 47 samples. Background-corrected log 2 intensity data were quantile normalised and assigned detection p-values based on level of intensity and proximity to baseline. Genes with an expression p-value <0.05 across all samples were retained for analysis, resulting in a subset of roughly 23000 genes. Multivariate normality was confirmed with a test for kurtosis implemented in the R package ICS 9 .
Assessment of variance of the data by multi-dimensional scaling (limma package in R 10 ) confirmed similar distribution of samples across the IL1B and PBS groups.

Gene co-expression network reconstruction from mouse microglia
Gene co-expression networks were inferred individually for the Development, IL1B and Interaction responses, using Graphical Gaussian Models implemented within the R software package "GeneNet" 14,15 . This computes partial correlations, which are a measure of conditional independence between two genes i.e. the correlation between two genes after the common effects of all other genes are removed. Three separate gene networks were built from the sets of genes identified by MANOVA to show a significant response to IL1B, Development and Interaction effect (Figure 2). Local FDR was set at 1e -13 (the most stringent threshold possible, to minimise network size) and a minimum edge-wise partial correlation was set at 0.0075 (the point from which partial correlations increased exponentially across all networks). Hiveplots 16 were used to illustrate topological differences between the networks; the axes relate to ranges of node degree and nodes are plotted on each axis according to their degree. An illustrative hiveplot is in Figure 2, with an animation displaying the entire range in Supplementary Video 1, and parameters in Supplementary Table 14.

Protein-protein interactions and Power Graph Analysis
The nodes of all three gene networks (IL1B, Development, and Interaction) were aggregated into one list and used to investigate protein interactions, with both the Netvenn 17, 18 and DAPPLE 19 tools. The Netvenn tool was also used to perform a Power Graph Analysis (PGA) 20 on the protein interaction network, and two super-powernodes (SPNs) were identified by examining which sets of powernodes were directly interconnected (Figure 3, Supplementary Figure 6).

Functional annotation of gene co-expression and PPI networks
Gene Ontology annotation and enrichment analysis was carried out with the WebGestalt platform 21 , always using as background the list from which the current set of interest was drawn to avoid inflation of significance e.g. all significantly expressed genes as background for MANOVA genes, and MANOVA genes as background for gene networks.
The REVIGO tool 22 was used to summarise GO terms based on semantic similarity measures, using an algorithm akin to hierarchical (agglomerative) clustering. Highly semantically similar GO terms were grouped as guided by the p-values supplied alongside the GO terms. This non-redundant GO term set was visualised using multidimensional scaling to render the subdivisions and the semantic relationships in the data. The disease-gene links used by the Gene Disease Association tool (GDA) 23 are assembled from the Genopedia compendium in the HuGE database of Human Genetic Epidemiology 24 and the OMIM database 25 , which are collections of data retrieved from biomedical literature and do not provide cell or tissue specific annotations. While these disease-gene associations are not tissue specific, the relevance and usefulness of protein-protein interaction modules to tissue or cell type specific transcriptional programs has been previously shown by our group and others [26][27][28] .

Transcription factor analysis
The transcriptional control of the protein networks was interrogated in several independent ways. Transcription factor affinities for promoter sequences in the genes coding for the proteins of interest were predicted using the PASTAA tool 29 . Transcription binding sites for STAT1, STAT3 and STAT5 TFs were identified from experimental ChIP-Chip data of microglia in a P1 rat model with LPS exposure (peaks with FDR <0.2) 30 . Correlation of transcription factor expression levels and potential targets was calculated by finding the linear correlation between each transcription factor and each gene individually.

Primary ex vivo mouse microglia isolated by MACS
Primary microglia were prepared from the brain of P1 mice pups. Brain tissues were dissociated using the Neural Tissue Dissociation Kit containing papain and the gentleMACS Octo Dissociator with Heaters. Microglia were isolated using anti-CD11B

Treatment of mouse microglia with inhibitors for DLG4 and STAT3
Two days after plating, microglia were treated for 6 to 12 hours with vehicle (DMSO, 10µL/ml of culture media), IL1B at 50 ng/mL + IFNg at 20ng/ml, IL1B at 50 ng/mL + IFNg and 20ng/ml + Bp-1-102 (a STAT3 inhibitor; Merck Millipore, Fontenay sous Bois, France) 31 at 15 or 30 µM, or IL1B at 50 ng/mL + IFNg at 20ng/ml + TAT-N-dimer (an inhibitor of the DLG4 protein NMDA receptor interaction; Merck Millipore) at 15 or 30nM. Bp-1-102 and TAT-N-dimer were added one hour before the cytokines. At the end of the treatment period, cells were harvested and mRNA extracted for gene expression analysis, and supernatant were collected for nitrites/nitrates or cytokines/chemokines measurement.

RNA extraction and quantification of gene expression by real-time qPCR
Total RNA from primary microglial cell cultures was extracted with the RNeasy mini kit according to the manufacturer's instructions (Qiagen, Courtaboeuf, France). RNA quality and concentration were assessed by spectrophotometry with the Nanodrop™ apparatus (Thermoscientific, Wilmington, DE, USA). Total RNA (1-2µg) was subjected to reverse transcription using the iScript™ cDNA synthesis kit (Bio-Rad, Marnes-la-Coquette, France).
RT-qPCR was performed in duplicate for each sample using SYBR Green Supermix (Bio-Rad) for 40 cycles with a 2-step program (5 seconds of denaturation at 96°C and 10 seconds of annealing at 60°C). Amplification specificity was assessed with a melting curve analysis.
Primers were designed using Primer3 software, and sequences and their NCBI references are given in Supplementary Table 13. The relative expression of genes of interest (GOI) was determined relative to expression of the reference gene, Glyceraldehyde 3-phosphate dehydrogenase (GAPDH). Analyses were performed with the Biorad CFX manager 2.1 software.

Multiplex cytokine and chemokine assay
Microglia media harvested at the end of the treatment was centrifuged briefly to remove particulates (300xg for 10 minutes). Cytokine and chemokine levels in the microglial media

Nitrites and nitrates assay
Microglia media harvested at the end of the treatment was centrifuged briefly to remove particulates (300xg for 10 minutes). Nitrite/nitrate content was measured using the nitrate/nitrite colorometric assay kit (Cayman Chemical, Ann Arbor, MI, USA) as directed.

Phagocytosis assay
Phagocytosis of fluorescently labelled E Coli particles by microglia was assessed using the pHrodo Red E. coli BioParticles Conjugate (Life Technologies) according to the manufacture's instructions. In brief, 50,000 primary microglial prepared as described above were plated in 48 well plates and after 12 hours of incubation with IL1B at 50 ng/mL + IFNat 20ng/ml in the presence or absence of TAT-N-dimer at 30nM, medium was changed to serum free media containing the recommended suspension of bioparticles. Cells were incubated for five hours, before the particle-containing media was removed, washed twice with serum free media, incubated with a solution of trypan blue for 1 min to quench extracellular fluorescence and 1 ml of serum containing media added to each well. The absorbance of each well (including cell free, bead free and media free controls) was read. To adjust for cell density, following reading of the plate, the cells were used in an MTT assay.

Immunohistochemistry of mouse brain sections and isolated cells
Male mice (OF1 strain; Charles River) subjected to the IL1B induced white matter injury

Human brain gene expression
Data on DLG4 gene expression in the developing brain were accessed from the BrainCloud resource, which includes 30,176 probes on 269 samples across the lifespan (fetal through the aged) 34 and plotted (Supplementary Figure 15, panel a).
Developing Transcriptome data as described in the technical white paper were downloaded for DLG4 35 and plotted (Supplementary Figure 15, panel b). Expression of DLG4 in the adult brain was visualized in the Allen Brain Atlas Brain Explorer software application (http://human.brain-map.org/static/brainexplorer), in which samples from the cerebral cortex are overlaid on an inflated white matter surfaces for each donor's brain, while samples in the subcortical regions of the brain are represented as spheres below the inflated surfaces.
Paired gene expression and genotype data from human brain were queried in the UKBEC collection of 134 brains from individuals free of neurodegenerative disorders 36 . Regional expression of DLG4 (probeset mean) was extracted for all areas including white matter (Supplementary Figure 8, panel c). Genotypes for rs17203281 in all individuals were also accessed.

MACS isolation and inflammatory activation of human CD11B+ microglia
All human post-mortem tissue (cells and tissues) was acquired with ethical approval at The (ab5076) and DLG4 (MA1-045) was performed as for mouse tissues above.

Immunohistochemistry of ex vivo human microglia and brain sections
For the collection of human microglia, post-mortem tissue without any neuropathological

Image Acquisition
Cohort 1: Imaging was performed on a Philips 3

Imaging data selection and quality control
The T2-weighted MRI anatomical scans were reviewed in order to exclude subjects with extensive brain abnormalities, major focal destructive parenchymal lesions, multiple punctate white matter lesions or white matter cysts, since these infants represent a heterogeneous minority (1-3%) with different underlying biology and clinical features to the general preterm population [37][38][39][40] . All MR-images were assessed for the presence of image artefacts (inferiortemporal signal dropout, aliasing, field inhomogeneity, etc.) and severe motion (for headmotion criteria see below). All exclusion criteria were designed so as not to bias the study but preserve the full spectrum of clinical heterogeneity typical of a preterm born population.
Diffusion tensor imaging (DTI) analysis was performed using FMRIB's Diffusion Toolbox (FDT v2.0) as implemented in FMRIB's Software Library (FSL v4.1.5; www.fmrib.ox.ac.uk/fsl) 41 . Each infant's diffusion weighted images were registered to their non-diffusion weighted (b0) image and corrected for differences in spatial distortion due to eddy currents. Non-brain tissue was removed using the brain extraction tool (BET) 42 . FA maps were constructed from 15 or 32 direction DTI, and Tract Based Spatial Statistics 43 was used to obtain a group white matter skeleton by using a modified pipeline specifically optimized for neonatal DTI analysis 44 . Diffusion tensors were calculated voxel wise, using a simple least squares fit of the tensor model to the diffusion data. From this, the tensor eigenvalues and FA maps were calculated, and thresholded at FA > 0.2, then linearly adjusted for PMA at scan and GA.

Infant genotyping
Saliva samples for both infant cohorts were genotyped on the Illumina HumanOmniExpress-12 array, as previously described 45  All samples successfully passed quality control.

Association of imaging features with genotype
A general linear model was applied in FSL to test for association between FA values and minor allele count in Cohort 1, and significance was assessed using the randomise tool for nonparametric permutation inference on neuroimaging data 47 with threshold-free cluster enhancement (TFCE) inference 48 ( Figure 6). This procedure was repeated independently for Cohort 2. There was no significant difference in clinical features (GA, PMA or days of ventilation) between infants with or without the minor allele in either Cohort (Supplementary   Table 15).

DLG4 expression quantitative trait loci analysis
UKBEC genotype and expression data as described above were downloaded for DLG4 white matter samples and SNP rs17203281. We used the transcript-level expression profile provided in the UKBEC dataset, which is estimated as the Windsorized mean (similar to trimmed mean) of the exon-level probesets that are considered expressed above background noise. It is chosen because it is robust to statistical outliers that may arise from alternative splicing.
Individuals were grouped according to minor allele count 0,1,2 or presence/absence of the minor allele (A) and outliers in the expression value were removed following thresholding by the boxplot.stats function in R. Significant group differences in expression values for the transcription-level were identified using a Student's t-test (Supplementary Figure 17).