IRF5:RelA Interaction Targets Inflammatory Genes in Macrophages

Summary
Interferon Regulatory Factor 5 (IRF5) plays a major role in setting up an inflammatory macrophage phenotype, but the molecular basis of its transcriptional activity is not fully understood. In this study, we conduct a comprehensive genome-wide analysis of IRF5 recruitment in macrophages stimulated with bacterial lipopolysaccharide and discover that IRF5 binds to regulatory elements of highly transcribed genes. Analysis of protein:DNA microarrays demonstrates that IRF5 recognizes the canonical IRF-binding (interferon-stimulated response element [ISRE]) motif in vitro. However, IRF5 binding in vivo appears to rely on its interactions with other proteins. IRF5 binds to a noncanonical composite PU.1:ISRE motif, and its recruitment is aided by RelA. Global gene expression analysis in macrophages deficient in IRF5 and RelA highlights the direct role of the RelA:IRF5 cistrome in regulation of a subset of key inflammatory genes. We map the RelA:IRF5 interaction domain and suggest that interfering with it would offer selective targeting of macrophage inflammatory activities.

Data were analyzed using an ABI 7900HT machine (Applied Biosystems, USA). All primer sets were tested for specificity and equal efficiency before use.

Protein Binding Microarrays (PBMs)
Sequences of the different primers and DNA ligands can be found in Additional file "primers_oligos_IRF". All nucleic acid samples were quantified according to the manufacturer's instructions on a Qubit Fluorometer (Invitrogen #Q32857, Paisley, United Kingdom) with either the Quant-iT dsDNA High Sensitivity Assay Kit (Invitrogen #Q33120) or the Quant-iT dsDNA Broad Range Assay Kit (Invitrogen #Q33130). Protein assays were performed using the Quant-iT™ Protein Assay Kit (Invitrogen #Q33210).

Protein expression and purification
Expression constructs for the IRF proteins (Homo sapiens) used in this study were created following a set of procedures previously established by Udalova and co-workers [42]. Briefly, pET vectors for expression in BL21 (DE3) Escherichia coli (Merck, Nottingham, United Kingdom) were used to produce histidine-tagged (His-tagged) recombinant proteins. Proteins were overexpressed through induction with 0.2 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at 30°C for 5 hours. Cell pellets were harvested in 'Ni-NTA binding' buffer with added EDTA-free protease inhibitor (Roche, West Sussex, United Kingdom) and pulse-sonicated for 2 minutes, and debris was removed via centrifugation at 16,000 g. A two-step purification procedure was then employed: first with the 'Ni-NTA His-Bind Resin' system (Merck #70666), then by DNA-affinity isolation of functional, DNA-binding protein. Ni-NTA purification was carried out according to the manufacturer's guidelines. For DNA-affinity isolation, the processing of a sample derived from 250 ml of bacterial culture required 0.128 µM of oligonucleotides specific for IRF protein binding. Prior to use, the oligonucleotides were annealed by incubation in NEB Buffer 3 at 94°C for 1 minute, followed by 69 further cycles of 1 minute each with a step-wise decrease of 1°C per cycle. A pre-annealed oligo mixture (712.5 µl) was conjugated with streptavidin-agarose (Sigma, Dorset, United Kingdom) before once-purified material from the preceding step was added to it.
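The step-down annealing programme can be written out explicitly to check its end point and duration. The following is a minimal sketch; it assumes that the first of the 69 cycles is already 1°C below the initial 94°C hold, and the function name and its parameters are illustrative only.

```python
def annealing_schedule(start_c=94, cycles=69, step_c=1, hold_min=1):
    """Return a list of (temperature_C, hold_minutes) steps.

    Assumption: the initial hold is at start_c, and each of the following
    cycles is step_c degrees cooler than the one before it.
    """
    steps = [(start_c, hold_min)]  # initial 1 min hold at 94 C
    for cycle in range(1, cycles + 1):
        steps.append((start_c - cycle * step_c, hold_min))
    return steps


schedule = annealing_schedule()
final_temp_c = schedule[-1][0]
total_minutes = sum(minutes for _, minutes in schedule)
print(f"final temperature: {final_temp_c} C, total time: {total_minutes} min")
```

Under this reading the programme ends at 25°C after 70 minutes in total.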

ChIP-Seq analysis
Reads were mapped onto mouse genome build 37 by NCBI and the Mouse Genome Consortium (Church et al., 2009), downloaded from UCSC (mm9; Fujita et al., 2011), using bowtie 0.12.7 (Langmead et al., 2009) with the following options: -n 2 -a --best --strata -m 1. Peaks were called with Zinba (version 2.02.01; Rashid et al., 2011) using default options, a window size of 200 and an FDR of 1%. Aligned reads and called peaks were visualized with IGV (Thorvaldsdottir et al., 2013). After visualization, we noticed that abundant peaks in the 3' region of genes caused binding events in the 5' region of genes to go undetected. For the promoter analysis only, we therefore called peaks using MACS2 restricted to a region from -10 kb to +1 kb around transcription start sites. This added 939 peaks to the Irf5 data set. Read densities were analyzed with in-house scripts.
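As an illustration of the promoter-restricted region, the sketch below computes a strand-aware window around a transcription start site. It assumes that the stated region means 10 kb upstream and 1 kb downstream of the TSS relative to the direction of transcription; the function name and the coordinate convention are hypothetical and are not taken from the in-house scripts.

```python
def promoter_window(tss, strand, upstream=10_000, downstream=1_000):
    """Return (start, end) of the promoter region around a TSS.

    Assumption: 'upstream' is measured against the direction of
    transcription, so the window is mirrored for minus-strand genes.
    The start coordinate is clipped at 0.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return max(0, start), end


print(promoter_window(50_000, "+"))  # (40000, 51000)
print(promoter_window(50_000, "-"))  # (49000, 60000)
```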

Interaction analysis and genomic enrichment
The significance of genomic enrichment was analysed using a simulation procedure similar to that of Ponjavic et al. (2009). Briefly, the genomic association between a test set of peaks and a genomic annotation is measured by randomly simulating sets of peaks of equivalent size and length distribution to the test set. Enrichment and depletion are measured as the ratio of the observed nucleotide overlap to the expected nucleotide overlap from 10,000 simulated sets, and significance is expressed as a P-value. The significance of a fold change difference is computed in an analogous manner by combining the results from two parallel simulations. Genomic regions of low mappability are excluded from the simulation. To control for biases in gene density, the overlap with chromatin marks was assessed in 50 kb regions around genes only. For the overlap with transcription factors, only regions 2 kb upstream and 0.5 kb downstream of transcription start sites were considered. The code for the simulations is publicly available (http://code.google.com/p/genomicassociation-tester/).
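The logic of the simulation can be sketched in a few lines. This is a simplified illustration and not the published tool: it assumes a single linear genome, non-overlapping annotation intervals, uniform random placement of peaks, no mappability mask, and a one-sided empirical P-value with a +1 correction. All names are hypothetical.

```python
import random


def overlap_bp(peaks, annotation):
    """Nucleotides shared between peaks and (non-overlapping) annotation intervals."""
    total = 0
    for p_start, p_end in peaks:
        for a_start, a_end in annotation:
            total += max(0, min(p_end, a_end) - max(p_start, a_start))
    return total


def simulate_enrichment(peaks, annotation, genome_length, n_sim=10_000, seed=0):
    """Compare observed overlap with overlaps of randomly placed peaks."""
    rng = random.Random(seed)
    observed = overlap_bp(peaks, annotation)
    lengths = [end - start for start, end in peaks]  # keep the length distribution

    simulated = []
    for _ in range(n_sim):
        random_peaks = []
        for length in lengths:
            # Peaks are placed independently, so simulated peaks may overlap
            # each other; a fuller implementation would prevent that.
            start = rng.randint(0, genome_length - length)
            random_peaks.append((start, start + length))
        simulated.append(overlap_bp(random_peaks, annotation))

    expected = sum(simulated) / n_sim
    fold = observed / expected if expected > 0 else float("inf")
    # Assumed one-sided test for enrichment.
    n_extreme = sum(1 for s in simulated if s >= observed)
    p_value = (n_extreme + 1) / (n_sim + 1)
    return {"observed": observed, "expected": expected, "fold": fold, "p_value": p_value}


# Toy example: two of three peaks fall inside annotated regions.
annotation = [(1_000, 2_000), (50_000, 51_000)]
peaks = [(1_100, 1_300), (50_200, 50_400), (70_000, 70_200)]
result = simulate_enrichment(peaks, annotation, genome_length=100_000, n_sim=1_000)
print(result)
```

Depletion would be tested in the same way by counting simulated overlaps that are less than or equal to the observed overlap.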

Motif analysis
Motif analysis in ChIP-Seq peaks was performed using MEME-ChIP (Machanick and Bailey, 2011). Motif discovery was performed on the top 500 peaks in 200 bp windows around the position with the highest read density in each ChIP-Seq peak. Motif discovery used both repeat-masked and unmasked sequences with the following options: "-dna -revcomp -mod anr -nmotifs 3 -minw 5 -maxw 30".
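Selecting the input windows for motif discovery can be sketched as follows. The ranking criterion for the "top 500" peaks is not stated here, so ranking by a per-peak score is an assumption, as are the field names `chrom`, `summit` and `score`.

```python
def summit_windows(peaks, top_n=500, width=200):
    """Return (chrom, start, end) windows centred on the summits of the top peaks.

    peaks: list of dicts with hypothetical keys 'chrom', 'summit'
    (position of highest read density) and 'score' (assumed ranking value).
    """
    ranked = sorted(peaks, key=lambda p: p["score"], reverse=True)[:top_n]
    half = width // 2
    # Windows near the chromosome start are clipped at 0 and so may be shorter.
    return [(p["chrom"], max(0, p["summit"] - half), p["summit"] + half) for p in ranked]


example_peaks = [
    {"chrom": "chr1", "summit": 5_000, "score": 12.0},
    {"chrom": "chr2", "summit": 800, "score": 40.0},
    {"chrom": "chr3", "summit": 50, "score": 3.0},
]
for window in summit_windows(example_peaks, top_n=2):
    print(window)
```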

Protein binding microarrays
We designed 2 × 105K Agilent arrays using eArray (details given below). These arrays comprised two main sets of probes: 12-mer sequences designed for IRF binding and a set of 11-mer sequences designed for NF-κB binding, used for validation purposes. As an IRF consensus sequence, we used the motif NRWANNGARAVY, which encodes a total of 3072 different sequences.
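The count of 3072 follows from the standard IUPAC nucleotide codes (N = 4 bases, R = 2, W = 2, V = 3, Y = 2): 4 × 2 × 2 × 1 × 4 × 4 × 1 × 1 × 2 × 1 × 3 × 2 = 3072. A short sketch that enumerates the sequences is given below; the function name is illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_iupac(motif):
    """Enumerate every concrete DNA sequence matching an IUPAC motif."""
    choices = [IUPAC[code] for code in motif]
    return ["".join(bases) for bases in product(*choices)]


sequences = expand_iupac("NRWANNGARAVY")
print(len(sequences))  # 3072
```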
Experiments were carried out in technical replicates showing a 98% correlation. Z-scores were assigned to each sequence represented on the array. The sequences were ranked and used to produce binding motifs over all 12 bp using WebLogo (http://weblogo.berkeley.edu/).
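One common way to assign Z-scores to array sequences is sketched below. The exact scoring procedure is not specified here, so the choices in this example are assumptions: replicate features of a sequence are summarised by their median, and the Z-score is computed against the mean and standard deviation across all sequences. The intensities are toy values.

```python
from statistics import mean, median, pstdev


def sequence_z_scores(intensities):
    """Rank sequences by Z-score.

    intensities: dict mapping sequence -> list of replicate feature intensities.
    Assumption: per-sequence signal is the median of its replicates.
    """
    signal = {seq: median(values) for seq, values in intensities.items()}
    mu = mean(signal.values())
    sigma = pstdev(signal.values())
    if sigma == 0:
        raise ValueError("all sequences have identical signal")
    z = {seq: (value - mu) / sigma for seq, value in signal.items()}
    # Highest-scoring (strongest binding) sequences first.
    return sorted(z.items(), key=lambda item: item[1], reverse=True)


toy = {"AAA": [10, 12, 11], "CCC": [1, 2, 3], "GGG": [5, 6, 7]}
ranked = sequence_z_scores(toy)
for seq, score in ranked:
    print(seq, round(score, 3))
```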

Microarrays (PBMs)
Description of probe design on the microarrays. Our microarrays are chips of 2 arrays, each with 104961 probes. Each array contains 1325 manufacturer probes (Agilent) and 103636 customized probes. Each probe is represented with 4 different flanks of 4 nt each (AGCT, ATGA, AGTC, AGAT), and each flanked probe is replicated 7 times. Additional file "IRF_design_microarray.txt" shows a breakdown of the number and type of probes present on each array.
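The probe numbers can be checked with simple arithmetic. The sketch below assumes that each of the 3072 IRF 12-mer sequences is present with all 4 flanks and all 7 replicates; how the remaining customized features are divided among other probe types is given only in the additional file and is not inferred here.

```python
FLANKS = ["AGCT", "ATGA", "AGTC", "AGAT"]
REPLICATES = 7
MANUFACTURER_PROBES = 1325
CUSTOM_PROBES = 103636
IRF_SEQUENCES = 3072  # from the NRWANNGARAVY consensus

features_per_sequence = len(FLANKS) * REPLICATES             # 4 flanks x 7 replicates
total_probes = MANUFACTURER_PROBES + CUSTOM_PROBES           # probes per array
irf_features = IRF_SEQUENCES * features_per_sequence         # assumed full coverage
remaining_custom = CUSTOM_PROBES - irf_features              # left for other probe sets

print(features_per_sequence, total_probes, irf_features, remaining_custom)
```

The total of 104961 matches the stated number of probes per array, which supports the split into manufacturer and customized probes.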

Protocol for generation and use of double-stranded protein microarrays.
Single-stranded probes on each array were rendered double-stranded with the following procedure. For each array on a 2x150K chip, 820 μl of "ds-mix" (NEB buffer 2, 0.1 μM dsPrimer, 2.5X BSA, 163 μM dNTPs, 1.63 μM Cy3-dCTP and 27.2 U of Klenow DNA polymerase I) was dispensed onto a "1x205K gasket" and combined with a chip; the entire unit was sealed within a hybridization chamber and incubated in a rotating oven at 37 °C for 90 min. The following washes were then carried out: 6 washes in 0.01% Triton X-100/PBS for 3 min each, followed by a 3 min wash in PBS. Arrays were dried via centrifugation. To ascertain the overall success of the procedure, arrays were scanned using the Agilent Microarray Scanner at maximum power and the image was analyzed for the extent of Cy3 incorporation within individual probes.
Prior to hybridization, arrays were blocked, washed according to the manufacturer's guidelines and incubated in 2% milk/PBS for 1 h at room temperature. This was followed by 2 washes (6 min each; 0.1% Tween-20/PBS followed by 0.01% Triton X-100/PBS) and a brief rinse in water before drying via centrifugation. Hybridizations were performed using a protein concentration of 0.01 μg/μl in 420 μl of protein binding reaction mix (10 mM HEPES pH 8, 0.5 M NH4OAc, 100 mM NaCl, 5 mM MgCl2/MgAcetate, 1 mM DTT and 5% glycerol). Protein binding reaction mixes were dispensed into the different compartments of a 2x105K gasket slide (Agilent) and combined with a chip, and the entire unit was sealed into a hybridization chamber. The assembled unit was rotated in the hybridization oven for 1 h at room temperature. Arrays were then washed 6 times with 1% Tween-20/PBS for 6 min each, followed by a further 6 washes with 0.01% Triton X-100/PBS for 6 min each. This was followed by a brief rinse in water and drying via centrifugation.
Labelling of bound protein was carried out in two stages. First, arrays were incubated with 0.8 μg of primary rabbit anti-His antibody (Santa Cruz) in a 2% milk/PBS solution for 1 h at room temperature. This was followed by 6 washes with 0.05% Tween-20/PBS for 3 min each and another 6 washes with 0.01% Triton X-100/PBS for 3 min each. Subsequently, arrays were incubated with 6 μg of secondary Cy5-conjugated anti-rabbit IgG antibody in a 2% milk/PBS solution for 30 min at 37 °C before being washed as above. Before drying, arrays were first rinsed in PBS for 6 min and then briefly again in water. Arrays were dried via centrifugation and scanned using the Agilent Microarray Scanner at maximum power.

Genotyping
The IRF5-/- line was genotyped for the DOCK2 mutation as described previously (Yasuda et al., 2013). Briefly, DNA was obtained from ear clips using REDExtract-N-Amp (Sigma), and PCR was performed using the following primers, which detect the DOCK2 mutation as a 305-bp product: DOCK2In29.4F GAC CTT ATG AGG TGG AAC CAC AAC C; DOCK2InR22.3.1R GAT CCA AAG ATT CCC TAC AGC TCC AC. IRF5-/- mice carrying the DOCK2 mutation were culled, and all experiments were performed on a line that was wild-type for DOCK2.

Accession numbers:
Unprocessed data have been deposited at ArrayExpress under accession numbers E-MTAB-2031 (ChIP-Seq data), E-MTAB-2661 (ChIP-Seq data) and E-MTAB-2032 (microarray data).

SUPPLEMENTAL FIGURES
Figure S1: (A) Cell surface receptor and cytokine expression in macrophages (BMDMs) differentiated with GM-CSF or M-CSF. FACS samples were collected at day 9 of differentiation and stained for F4/80, CD206 or MHCII, IRF5 after the cells were stimulated with LPS (100 ng/mL; 4 hrs) or left unstimulated. Data are representative of 6 experiments. (B) Scatter plots of RelA ChIP-seq peaks following LPS stimulation at 0.5 or 2 hr against the unstimulated condition. (C) Scatter plots of IRF5 ChIP-seq peaks following LPS stimulation at 0.5 or 2 hr against the unstimulated condition. (D) Specific recruitment of IRF5 to example gene promoters (Il6, Ccl5, Il12b, Il1a, Tnf, Nfkbia, Gadd45b, Irf1, Il12a and Mllt6) was analysed by qPCR in GM-BMDMs from either WT or IRF5 KO mice following LPS stimulation (100 ng/mL; 2 hrs). No recruitment of IRF5 was observed on the negative control Hbb promoter. Data show mean percentage input relative to genomic DNA (gDNA) plus or minus SD of a representative experiment.

Approximately 70% of RelA protein was degraded following Cre infection compared to empty control as analysed by Western blotting of RelA. IRF5 protein stability was unaffected following Cre infection.

SUPPLEMENTARY TABLES
Table S1: Mapping ChIP-seq reads to genome: (A) A total of 6 samples were analysed for ChIP-seq following LPS stimulation (100 ng/mL) as indicated. For each sample the total number of reads sequenced (total) and the total reads mapped (mapped) are shown. Reads mapping to the exact same position (duplicates) were removed before peak calling. (B) The genome was segmented into annotated regions (cds, utr, upstream, downstream, intronic, intergenic) based on the ENSEMBL gene set. To avoid over-counting, an interval is associated with an annotation depending on the location of the peak (the point with the highest read density within an interval). (C) and (D) To assess whether IRF5 (C) and RelA (D) intervals are significantly associated with functional genome annotations (described in Figure 1B), a simulation procedure was applied (see Methods). Observed (Observed nucleotide overlap between IRF5/RelA intervals and a genomic region); Expected (Expected nucleotide overlap between IRF5/RelA intervals and a genomic region based on simulations); CI25low/CI95high (95% confidence intervals); Stddev (Standard deviation of expected overlap); Fold (Fold change: Observed/Expected); l2fold (log2 fold change).

Table S2: (A) Differentially expressed genes are called at FDR = 1% and having a greater than two-fold change in expression following LPS stimulation. (B) Fold enrichments of IRF5 and RelA ChIP-seq peaks at chromatin marked regions obtained by simulation procedure (see Methods) were used to assess whether IRF5 or RelA were associated with chromatin marks for enhancers (H3K4ME1) or promoters (H3K4ME3) in BMDMs (Barish et al., 2010) and BMDCs (Garber et al., 2012). All enrichments are statistically significant (p < 10^-4). (C) Fold enrichments obtained by simulation procedure (see Methods) were used to assess degree of overlap of the IRF5:RelA cistrome with PU.1 or PU.1-less marked promoters or enhancers as indicated. All enrichments are statistically significant (p < 10^-4).

Table S3: Genes affected by conventional IRF5 KO (A) and conditional RelA KO (B) relative to WT following LPS stimulation split into categories as indicated in Figure 3A. GM-BMDMs from conventional IRF5 KO (left panel) or conditional RelA KO were each compared to WT controls following stimulation by LPS