Genome-wide analysis reveals no evidence of trans chromosomal regulation of mammalian immune development

It has been proposed that interactions between mammalian chromosomes, or transchromosomal interactions (also known as kissing chromosomes), regulate gene expression and cell fate determination. Here we aimed to identify novel transchromosomal interactions in immune cells by high-resolution genome-wide chromosome conformation capture. Although we readily identified stable interactions in cis, and also between centromeres and telomeres on different chromosomes, surprisingly we identified no gene regulatory transchromosomal interactions in either mouse or human cells, including previously described interactions. We suggest that advances in the chromosome conformation capture technique and the unbiased nature of this approach allow more reliable capture of interactions between chromosomes than previous methods. Overall our findings suggest that stable transchromosomal interactions that regulate gene expression are not present in mammalian immune cells and that lineage identity is governed by cis, not trans chromosomal interactions.


Author summary
It is a widely held belief that, in the darkness of the nucleus, strands of DNA that make up different chromosomes frequently meet to 'kiss'. These kisses, or transchromosomal interactions, are thought to be important for the expression of genes and thus cell development. Here, we aimed to identify novel transchromosomal interactions in mouse and human immune cells by high-resolution genome-wide chromosome conformation capture methods. Although we readily identified stable interactions within chromosomes and also between centromeres and telomeres on different chromosomes, surprisingly we identified no gene regulatory transchromosomal interactions in either mouse or human cells, including those previously described. Overall

Introduction
Each chromosome contains just one DNA molecule. Recent technological advances have allowed characterisation of the elaborate three-dimensional structures that form from this DNA [1]. These structures include topologically associated domains, which partition the chromosome, and elegant DNA loops that link gene promoters to distant enhancers. In addition to these intrachromosomal structures formed within the same DNA molecule, there are transchromosomal interactions formed between different chromosomes. Relative to intrachromosomal interactions, the frequency, nature and function of transchromosomal interactions are poorly understood [2]. In contrast to the multitude of intrachromosomal interactions known to regulate gene expression, only a handful of transchromosomal interactions have been described. For example, transchromosomal interactions were reported to be crucial for the appropriate expression of a single olfactory gene amongst the ~1300 within the genome [3,4] and for X chromosome inactivation [5][6][7]. Interestingly, a large number of the reported transchromosomal interactions have been characterised in cells of the immune system. For example, in both mouse and human T cells the insulin-like growth factor 2 (Igf2) locus was reported to interact with a number of loci on different chromosomes [8,9]. Also in T cells, a regulatory region on mouse chromosome 11 (the T helper 2 locus control region; LCR) was suggested to interact with loci encoding the cytokine interferon gamma (Ifng) on chromosome 10 [10] and interleukin 17 (IL-17) on chromosome 1 [11]. Perturbation of these interactions was associated with altered expression of Ifng and IL-17, respectively. In mouse B cell progenitors, the interaction between the immunoglobulin heavy chain (Igh) locus on chromosome 12 and the immunoglobulin light chain (Igk) locus on chromosome 6 was reported to be important for the rearrangement of the heavy chain locus [12].
These transchromosomal interactions were all identified by chromatin conformation capture, in which crosslinking, dilution of a ligation reaction and PCR are used to deduce the relative physical proximity of two loci in three dimensions; by DNA FISH, in which microscopy and labelled probes are used to locate loci within individual nuclei; or by both. These techniques are targeted approaches. Here we aimed to use an unbiased, genome-wide approach to identify novel gene regulatory transchromosomal interactions in three distinct mouse and human immune cell populations. Unexpectedly, we found very few interactions between chromosomes, and none were gene regulatory or conserved. Overall, our findings question the existence of stable, gene-regulatory transchromosomal interactions underlying immune cell identity.

Results
To elucidate novel transchromosomal interactions, we generated in situ HiC libraries from both mouse and human B cells and CD4+ and CD8+ T cells (S1A and S1B Fig). The resulting ~200 million paired-end reads were then mapped to the appropriate genome, filtered for artefacts such as dangling ends and self-circling reads, and counted into 50 kb bins with the diffHic software package [13]. DNA-DNA interactions were detected by comparing the interaction intensity in each bin to those surrounding it to determine significant interactions relative to background [14].
Using this pipeline we detected hundreds of interactions between chromosomes in each cell population (S1 Table). Furthermore, our data and publicly available promoter capture HiC data [15] validated numerous previously reported interactions within chromosomes (Fig 1A, S1C-S1E Fig). These include lineage-specific interactions [16][17][18] and others seen in multiple cell lineages [19,20]. Consistent with previous literature [21], transchromosomal interactions are enriched in gene-rich, centrally located chromosomes (Fig 1B, S2A Fig). However, closer examination of these interactions reveals that a high percentage (74-90% in mouse and 82-94% in human) contain regions recommended to be removed, or 'blacklisted', from analyses for reasons that include high or low mappability, repetitive sequence, and location within telomeres or centromeres [22,23]. After application of blacklisting the majority of transchromosomal interactions are removed (Fig 1B and 1C, S2 Table). This is in stark contrast to intrachromosomal interactions, of which less than 3% contain blacklisted regions (Fig 1C). The majority of transchromosomal interactions remaining after blacklisting linked regions close to telomeres (Fig 1D and 1E, S2C Fig). Thus it appears that the majority of the transchromosomal interactions detected in mammalian immune cells may be a consequence of telomeric and centromeric clustering [24]. Additional experiments would be required to characterize the true specificity and possible functionality of these interactions. Importantly, the detection of these interactions confirms that in situ HiC is able to detect interactions between chromosomes.
To determine if any of the detected transchromosomal interactions, whether associated with telomeres or centromeres or not, have a gene regulatory function, we examined the relationship between lineage-specific transchromosomal interactions (those found in only one of the cell populations) (S2 Table) and expression of the genes associated with these interactions [25]. In the mouse, we found that the 15 lineage-specific transchromosomal interactions (3 B cell, 8 CD8+ T cell and 4 CD4+ T cell) overlap only 3 genes (Cct4, Lars2, Hjurp) expressed (>5 RPKM) in any of the three lineages, and none of these was expressed specifically, or differentially, in the lineage exhibiting the lineage-specific transchromosomal interaction. Similarly, in humans, we found that none of the 38 lineage-specific transchromosomal interactions (18 B cell, 5 CD8+ T cell and 15 CD4+ T cell) (S2 Table) was associated with any protein-coding gene differentially expressed (>5 RPKM) in the lineage exhibiting the lineage-specific transchromosomal interaction. This suggests that none of the detected lineage-specific transchromosomal interactions perform a gene regulatory function in mouse or human B or T cells.
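The two selection criteria used in this paragraph, "found in only one of the cell populations" and "expressed (>5 RPKM)", can be illustrated with a minimal Python sketch. This is not the analysis code used in the study: the function names and the toy interaction labels are hypothetical, interactions are compared by exact identity rather than with the gap-tolerant matching described in the Methods, and RPKM is computed from its standard definition (reads per kilobase of gene per million mapped reads).

```python
def lineage_specific(interactions_by_lineage):
    """Return, for each lineage, the interactions found in that lineage only.

    `interactions_by_lineage` maps a lineage name to a set of hashable
    interaction identifiers (hypothetical labels in the example below).
    """
    result = {}
    for lineage, found in interactions_by_lineage.items():
        others = set().union(
            *(v for k, v in interactions_by_lineage.items() if k != lineage)
        )
        result[lineage] = set(found) - others
    return result


def rpkm(count, gene_length_bp, library_size):
    """Reads per kilobase of gene per million mapped reads."""
    return count * 1e9 / (gene_length_bp * library_size)


def is_expressed(count, gene_length_bp, library_size, threshold=5.0):
    """Apply the >5 RPKM expression threshold quoted in the text."""
    return rpkm(count, gene_length_bp, library_size) > threshold


toy = {
    "B": {"i1", "i2", "shared"},
    "CD4": {"i3", "shared"},
    "CD8": {"shared"},
}
specific = lineage_specific(toy)
```

In this toy input, `shared` is dropped from every lineage because it is detected in all three populations.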
It has been suggested that if transchromosomal interactions were functionally important they would be evolutionarily conserved [2]. Therefore, we examined the handful of genes and genomic regions associated with all transchromosomal interactions in mouse and human B and T cells. We found that none of the lineage-specific transchromosomal interactions link orthologous regions in mouse and human.
As we were able to detect transchromosomal interactions, but none of a gene regulatory nature, we examined regions previously reported to be involved in regulatory interactions between chromosomes. We examined our CD4+ T cell data for interactions between the previously mentioned LCR region on mouse chromosome 11 and loci encoding the cytokine interferon gamma (Ifng) on chromosome 10 [10] and interleukin 17 (IL-17) on chromosome 1 [11]. Curiously, no interactions were detected between the LCR and Ifng or IL17 loci in mouse CD4+ T cells (Fig 2A-2D). Intrachromosomal interactions at the loci exhibited three-dimensional structure as expected (Fig 2E and 2F), indicating that the in situ HiC data were of sufficient quality. Similarly, in human CD4+ T cells we found no interactions between the LCR and Ifng or IL17 loci (Fig 2G-2J). Again, intrachromosomal interactions at the loci were as expected (Fig 2K and 2L). These analyses were repeated with raw data (no artefact removal
To determine if the depth of sequencing of our in situ HiC had inhibited detection of the previously reported transchromosomal interactions, we examined publicly available promoter capture HiC data from human CD4+ T cells [15]. The LCR-Ifng or IL17 interactions were also undetectable in this extremely high-resolution data (S3E and S3F Fig).
We then attempted to detect another previously reported transchromosomal interaction suggested to occur between the immunoglobulin heavy (Igh) and light chain (Igk) loci in mouse B cell progenitors [12]. Our transchromosomal interaction detection pipeline was applied to in situ HiC libraries generated from two B cell progenitors: pro-B cells and immature B cells. Curiously again, using our unbiased, genome-wide approach, we found no interactions between Igh on chromosome 12
In summary, using an unbiased, genome-wide approach we detect neither novel, nor previously reported, gene-regulatory transchromosomal interactions in three dominant mouse and human immune cell populations.

Discussion
For many years DNA Fluorescent In situ Hybridisation (FISH) [26] and chromatin conformation capture (3C) [27] were the dominant technologies used to examine chromosomal interactions, whether in cis or trans. However, incongruous results from FISH versus 3C within cell types, or in fact from the same technique between studies, have been a persistent issue when examining transchromosomal interactions. For example, two studies reporting transchromosomal interactions between Igf2 and loci on other chromosomes in mouse T cells found no common interactions [8,9], while studies of interactions in human T cells found contradictory evidence of interaction [28][29][30].
To address this vexed issue, we used the in situ HiC technique to search for transchromosomal interactions across two species and three distinct cell populations. With this unbiased, genome-wide approach, we were unable to detect any conserved, gene regulatory transchromosomal interactions. While our findings are clear and suggest gene regulatory transchromosomal interactions do not function in the mammalian immune system, it is not possible to be totally conclusive about a negative finding. For example, we cannot rule out gene regulatory interactions that are weak, transient, present in highly repetitive regions or in regions without MboI restriction sites. Furthermore, because we used only male-derived DNA we could not examine interactions reported to occur between X chromosomes during X chromosome inactivation [31].
(Figure legend fragment) encloses the region shown in Fig 2A. (D) Expanded HiC contact matrix of regions on chromosome 1 and 11 in mouse CD4+ T cells previously reported to interact. Dotted square encloses the region shown in Fig 2B.
Although we were unable to detect gene regulatory transchromosomal interactions, we do detect large numbers of interactions between sub-centromeric and sub-telomeric regions in all cell populations. In addition to demonstrating that in situ HiC is able to detect physiologically relevant transchromosomal interactions, these interactions may provide a genomic window into three-dimensional nuclear architecture. For example, changes in interactions in particular centromere-associated clusters detected by in situ HiC might betray changes in nuclear architecture, such as relocating nucleoli, around which some centromeres are known to cluster [32]. These kinds of analyses may also provide insight into previously observed transchromosomal interactions thought to be a consequence of nuclear reorganization [30].
Physiologically relevant transchromosomal interactions that are transient and/or weak may not be detectable by in situ HiC. However, this does not explain the absence of the interactions between LCR and Ifng or IL17 loci in T cells, or the immunoglobulin loci in B cell progenitors, as these interactions are reported to occur in 40-50% of cells [10,12] and the interactions are reported to be as strong as intrachromosomal interactions [10].
Differences between results presented here and those previously reported are likely due to differences in methodology. Previous studies relied on targeted amplification-dependent chromatin capture techniques and/or DNA FISH. It is increasingly clear that even with the appropriate controls [27], a minute amplification bias in a targeting probe combined with the large number of amplification steps required for 3C-based approaches can lead to false positives [2]. Furthermore, it has been suggested that up to half of the ligation events in chromatin capture techniques that rely on dilution of the ligation reaction to deduce proximity, such as 3C or 'dilution' HiC, link regions of DNA that were not truly associated in the intact nucleus [33]. Although DNA-FISH does not exhibit amplification bias, it does suffer from the resolution limitations of light microscopy (250-500nm). Thus it may be that the Igh and Igk loci in B cell progenitors, or other FISH-demonstrated interactions, frequently lie within hundreds of nanometres of each other, but are nevertheless not sufficiently proximate to be regulatory or chemically crosslinked and thus detected by in situ HiC.
In summary, the unbiased, genome-wide in situ HiC approach found no evidence for the existence of conserved, lineage-specific, gene regulatory transchromosomal interactions in mammalian immune cells, bringing into question the existence of stable, gene-regulatory transchromosomal interactions underlying immune cell identity.
Anonymized human samples were obtained from a volunteer blood donor registry (http://www.blooddonorregistry.org/home/), which requires that donors give consent to their donation being used for research purposes; thus no specific consent was required, or acquired, for the work.

Cell isolation
All animal experiments were performed using C57BL/6 male mice at 6-8 weeks of age. Mice were maintained at The Walter and Eliza Hall Institute Animal Facility under specific pathogen-free conditions. Males were randomly chosen from the relevant pool.
Flow cytometric analyses were performed on BD FACSCanto with sorting on the BD Aria or Influx (BD Bioscience). Antibodies were purchased from BD Bioscience or eBioscience (S3 Table).

HiC
HiC was performed as previously published [14]. Primary immune cell libraries for both human and mouse were generated in biological duplicate. Libraries were sequenced on an Illumina NextSeq 500 to produce 75 bp paired-end reads. Between 160 million and 375 million valid read pairs were generated per sample (S4 Table). Hi-C sequencing data for mouse pro-B cells and immature B cells were obtained from Gene Expression Omnibus accession number GSE99163.

Total RNA isolation
RNA was isolated using the miRNeasy Micro Kit (QIAGEN) following manufacturer's instructions.

RNA-seq analysis
All samples were acquired from two male human donors. Each donor provided one sample per biological condition, giving each condition two replicates. RNA libraries were prepared using Illumina's TruSeq Total Stranded RNA kit with Ribo-zero Gold (Illumina) according to the manufacturer's instructions. The rRNA-depleted RNA was purified and reverse transcribed using SuperScript II reverse transcriptase (Invitrogen). Total RNA-Seq libraries were sequenced on the Illumina NextSeq 500, generating 80 bp paired-end reads. The reads were aligned to the human genome (GRCh38/hg38) using the Rsubread aligner [34]. The number of fragments overlapping Ensembl genes was summarized using featureCounts [35].
Differential expression analyses were undertaken using the edgeR [36] and limma [37] software packages. Any gene that did not achieve a count per million mapped reads (CPM) of 0.1 in at least 2 samples was deemed to be unexpressed and subsequently filtered from the analysis. Compositional differences between libraries were normalized using the trimmed mean of log expression ratios (TMM) [38] method. Counts were transformed to log2-CPM with associated precision weights using voom [39]. Differential expression was assessed using linear models and robust empirical Bayes moderated t-statistics [40]. P-values were adjusted to control the false discovery rate (FDR) below 5% using the Benjamini and Hochberg method. To increase precision, the linear model incorporated a correction for a donor batch effect.
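The expression filter and the Benjamini and Hochberg adjustment described above can be illustrated with a short Python sketch. It is not the edgeR/limma code used for the analysis: the function names and toy numbers are hypothetical, and library sizes are taken as plain column sums without TMM scaling.

```python
def cpm(counts, lib_sizes):
    """Counts per million for a genes x samples table (list of rows)."""
    return [[c * 1e6 / n for c, n in zip(row, lib_sizes)] for row in counts]


def keep_expressed(counts, min_cpm=0.1, min_samples=2):
    """Flag genes reaching `min_cpm` in at least `min_samples` samples."""
    lib_sizes = [sum(col) for col in zip(*counts)]
    return [
        sum(value >= min_cpm for value in row) >= min_samples
        for row in cpm(counts, lib_sizes)
    ]


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


toy_counts = [
    [0, 0, 0, 0],                              # never detected
    [5, 3, 0, 0],                              # detected in two samples
    [1_000_000, 2_000_000, 1_500_000, 1_200_000],
]
kept = keep_expressed(toy_counts)
adj = bh_adjust([0.01, 0.04, 0.03, 0.5])
```

A gene would then be called differentially expressed at the stated 5% FDR if its adjusted p-value falls below 0.05.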

HiC data processing
Read processing and alignment. Reads from each sample were aligned using the presplit_map.py script in the diffHic package v1.4.0 [13]. Briefly, reads were split into 5' and 3' segments if they contained the MboI ligation signature (GATCGATC), using cutadapt v0.9.5 [41] with default parameters. Segments and unsplit reads were aligned to the GRCm38/mm10 build of the Mus musculus genome or the GRCh38/hg38 build of the Homo sapiens genome using bowtie2 v2.2.5 [42] in single-end mode. All alignments from a single library were pooled together and the resulting BAM file was sorted by read name. The FixMateInformation command from the Picard suite v1.117 (https://broadinstitute.github.io/picard/) was applied to synchronise mate information for each read pair. Alignments were re-sorted by position and potential duplicates were marked using the MarkDuplicates command, prior to a final re-sorting by name. This was repeated for each library generated from each sample in the data set. Each BAM file was further processed to identify the MboI restriction fragment to which each read was aligned. This was performed using the preparePairs function in diffHic, after discarding reads marked as duplicates and those with mapping quality scores below 10. Thresholds were applied to remove artefacts in the libraries (S4 Table). Read pairs were ignored if one read was unmapped or discarded, or if both reads were assigned to the same fragment in the same orientation. Pairs of inward-facing reads or outward-facing reads on the same chromosome separated by less than a certain distance (min.inward and min.outward, respectively) were also treated as dangling ends and were removed. For each read pair, the fragment size was calculated based on the distance of each read to the end of its restriction fragment. Read pairs with fragment sizes above ~1200 bp (max.frag) were considered to be products of off-site digestion and removed.
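The read-pair filters listed above can be sketched in Python as follows. This is only an illustration of the rules as worded in the text, not the diffHic implementation: the `ReadPair` record and `keep_pair` function are hypothetical, positions are treated as single coordinates, and the `min_inward` and `min_outward` values in the example are placeholders because the thresholds actually used are given in S4 Table.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str   # "+" or "-"
    frag1: int     # index of the restriction fragment for read 1
    chrom2: str
    pos2: int
    strand2: str
    frag2: int
    frag_size: int       # summed distance of each read to its fragment end
    mapped: bool = True  # False if either read is unmapped or discarded


def keep_pair(p, min_inward, min_outward, max_frag=1200):
    """Return True if the read pair passes the artefact filters."""
    if not p.mapped:
        return False
    if p.frag_size > max_frag:
        return False  # treated as a product of off-site digestion
    if p.chrom1 != p.chrom2:
        return True   # the distance thresholds apply within a chromosome only
    if p.frag1 == p.frag2 and p.strand1 == p.strand2:
        return False  # same fragment, same orientation
    (lo_pos, lo_strand), (hi_pos, hi_strand) = sorted(
        [(p.pos1, p.strand1), (p.pos2, p.strand2)]
    )
    gap = hi_pos - lo_pos
    if lo_strand == "+" and hi_strand == "-" and gap < min_inward:
        return False  # close inward-facing pair
    if lo_strand == "-" and hi_strand == "+" and gap < min_outward:
        return False  # close outward-facing pair
    return True


limits = dict(min_inward=1000, min_outward=25000)  # placeholder thresholds
trans = ReadPair("chr1", 100, "+", 1, "chr2", 500, "-", 7, frag_size=400)
close_inward = ReadPair("chr1", 100, "+", 1, "chr1", 600, "-", 2, frag_size=400)
distant = ReadPair("chr1", 100, "+", 1, "chr1", 900_000, "-", 2000, frag_size=400)
oversized = ReadPair("chr1", 100, "+", 1, "chr2", 500, "-", 7, frag_size=1500)
```

Note that read pairs linking different chromosomes are exempt from the distance thresholds, so these filters do not by themselves remove transchromosomal signal.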
In this manner, approximately 70-75% of read pairs were successfully assigned to restriction fragments in each library. An estimate of alignment error was obtained by comparing the mapping location of the 3' segment of each chimeric read with that of the 5' segment of its mate. If the two segments were not inward-facing and separated by less than ~1200 bp (chim.dist), then a mapping error was considered to be present. Of all the chimeric read pairs for which this evaluation could be performed, around 1-5% were estimated to have errors, indicating that alignment was generally successful. Technical replicates of the same library from multiple sequence runs were then merged with the mergePairs function of diffHic.
Data correction and detecting loop interactions. Loop interactions were detected using methods in the diffHic package. Read pairs were counted into 50 kbp bin pairs (with bin boundaries rounded up or down to the nearest MboI restriction site or blacklisted region edge (see below), respectively) using the squareCounts function. Only read pairs mapped to a placed scaffold were included; unlocalized and unplaced scaffolds were therefore excluded. Mitochondrial read pairs were also excluded.
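The counting step can be pictured with a simplified Python sketch. It is not the squareCounts implementation: `bin_pair_counts` is a hypothetical helper, bins have a fixed 50 kbp width (the rounding of bin boundaries to restriction sites is omitted), and the set of placed chromosomes is supplied by the caller.

```python
from collections import Counter

BIN_SIZE = 50_000


def bin_pair_counts(pairs, placed, bin_size=BIN_SIZE):
    """Count read pairs into unordered pairs of fixed-width genomic bins.

    `pairs` yields (chrom1, pos1, chrom2, pos2) tuples; pairs touching a
    sequence outside `placed` (e.g. chrM or unplaced scaffolds) are skipped.
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        if chrom1 not in placed or chrom2 not in placed:
            continue
        first = (chrom1, pos1 // bin_size)
        second = (chrom2, pos2 // bin_size)
        counts[tuple(sorted((first, second)))] += 1
    return counts


toy_pairs = [
    ("chr1", 10_000, "chr1", 120_000),
    ("chr1", 130_000, "chr1", 40_000),   # same bin pair, reads swapped
    ("chr1", 5, "chrM", 10),             # mitochondrial, skipped
    ("chr2", 60_000, "chr1", 10),        # transchromosomal bin pair
]
toy_counts = bin_pair_counts(toy_pairs, placed={"chr1", "chr2"})
```

Sorting the two bins makes the count independent of which read of the pair is listed first, so cis and trans bin pairs are handled by the same code path.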
Looping interactions were detected using a method similar to that described previously [14]. Specifically, read pairs were counted in bin pairs for all libraries of a given cell type or condition. For each bin pair, the log-fold change over the average abundance of each of several neighbouring regions was computed. Neighbouring regions in the interaction space included a square quadrant of sides 'x+1' that was closest to the diagonal and contained the target bin pair in its corner; a horizontal stripe of length '2x+1' centred on the target bin pair; a vertical stripe of '2x+1', similarly centred; and a square of sides '2x+1', also containing the target bin pair in the centre. The enrichment value for each bin pair was defined as the minimum of these log-fold changes, i.e., the bin pair had to have intensities higher than all neighbouring regions to obtain a large enrichment value. These enrichment values were calculated using the enrichedPairs function in diffHic, with 'x' set to 5 bin sizes (i.e., 250 kbp). Putative loops were then defined as those with enrichment values above 0.5, with average count across libraries greater than 10, and that were more than 1 bin size away from the diagonal.
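The neighbourhood-based enrichment value described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the enrichedPairs implementation: it works on a single dense square matrix of average counts for one chromosome with the target in the upper triangle (column index at least the row index), neighbourhoods are clipped at the matrix edge, and the pseudo-count `prior` used to stabilise the log-ratio is an assumption, as are the function names.

```python
import math


def _mean(matrix, rows, cols, skip):
    """Mean of the in-bounds cells in rows x cols, excluding the target cell."""
    n = len(matrix)
    vals = [
        matrix[r][c]
        for r in rows
        for c in cols
        if 0 <= r < n and 0 <= c < len(matrix[r]) and (r, c) != skip
    ]
    return sum(vals) / len(vals) if vals else 0.0


def enrichment(matrix, i, j, x=5, prior=1.0):
    """Minimum log2 fold change of bin pair (i, j) over four neighbourhoods."""
    target = matrix[i][j]
    span_rows = range(i - x, i + x + 1)
    span_cols = range(j - x, j + x + 1)
    neighbourhoods = [
        (range(i, i + x + 1), range(j - x, j + 1)),  # quadrant towards diagonal
        ([i], span_cols),                            # horizontal stripe
        (span_rows, [j]),                            # vertical stripe
        (span_rows, span_cols),                      # surrounding square
    ]
    return min(
        math.log2((target + prior) / (_mean(matrix, rows, cols, (i, j)) + prior))
        for rows, cols in neighbourhoods
    )


def is_putative_loop(matrix, i, j, x=5, min_enrich=0.5, min_count=10):
    """Apply the three thresholds quoted in the text."""
    return (
        abs(i - j) > 1
        and matrix[i][j] > min_count
        and enrichment(matrix, i, j, x) > min_enrich
    )


size = 21
flat = [[2.0] * size for _ in range(size)]
peak = [row[:] for row in flat]
peak[5][15] = peak[15][5] = 40.0
```

Taking the minimum over the four neighbourhoods means a bin pair scores highly only if it stands out from every one of them, which penalises bin pairs that merely sit on a broad stripe or block of high counts.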
Blacklisted regions and removal of centromere and telomere loops. Blacklisted genomic regions were obtained from ENCODE for hg38 and mm10 [23]. Loops that had at least one anchor in a blacklisted genomic region were removed. Additionally, loops with an anchor within a centromere or telomere region, as defined by UCSC genome annotation, were removed.
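The anchor-based removal rule amounts to an interval-overlap test, sketched below in Python. The function names, the half-open coordinate convention, and the example regions are hypothetical; real blacklist, centromere and telomere coordinates would come from the ENCODE and UCSC annotations cited above.

```python
def overlaps_region(anchor, regions):
    """True if a (chrom, start, end) anchor overlaps any listed region.

    `regions` maps a chromosome name to a list of (start, end) intervals;
    all intervals are treated as half-open.
    """
    chrom, start, end = anchor
    return any(
        start < region_end and region_start < end
        for region_start, region_end in regions.get(chrom, [])
    )


def remove_flagged_loops(loops, regions):
    """Drop loops with at least one anchor inside a flagged region."""
    return [
        (first, second)
        for first, second in loops
        if not overlaps_region(first, regions)
        and not overlaps_region(second, regions)
    ]


toy_regions = {"chr1": [(0, 100_000)], "chr2": [(500_000, 550_000)]}
toy_loops = [
    (("chr1", 50_000, 100_000), ("chr3", 0, 50_000)),         # anchor 1 flagged
    (("chr3", 0, 50_000), ("chr2", 520_000, 570_000)),        # anchor 2 flagged
    (("chr1", 100_000, 150_000), ("chr2", 600_000, 650_000)), # kept
]
kept_loops = remove_flagged_loops(toy_loops, toy_regions)
```

The same function can be applied twice, once with the blacklist and once with the centromere and telomere intervals, or once with the two region sets merged.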
Finding overlaps between bin pairs. Overlaps between bin pairs were identified using the overlapsAny function in the InteractionSet package with type = equal and maxgap = 100kb [43]. This considers an overlap to be present if anchors have a separation of less than the maxgap value and if both anchors of the bin pairs overlap.
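The matching rule as worded above can be expressed in a few lines of Python. This sketch follows the sentence in the text rather than the InteractionSet source: the function names are hypothetical, anchors are assumed to be stored in a consistent order within each bin pair, and the strict "less than maxgap" comparison is taken from the wording here, so boundary cases may be treated differently by the package itself.

```python
def anchor_gap(a, b):
    """Distance between two (chrom, start, end) anchors on one chromosome.

    Returns 0 when the half-open intervals overlap or touch.
    """
    return max(0, max(a[1], b[1]) - min(a[2], b[2]))


def anchors_match(a, b, maxgap):
    return a[0] == b[0] and anchor_gap(a, b) < maxgap


def bin_pairs_overlap(p, q, maxgap=100_000):
    """Both anchors of p must match the corresponding anchors of q."""
    return anchors_match(p[0], q[0], maxgap) and anchors_match(p[1], q[1], maxgap)


def overlaps_any(query, subject, maxgap=100_000):
    """One boolean per query bin pair: does it match any subject bin pair?"""
    return [any(bin_pairs_overlap(p, q, maxgap) for q in subject) for p in query]


query = [
    (("chr1", 0, 50_000), ("chr5", 1_000_000, 1_050_000)),
    (("chr1", 0, 50_000), ("chr5", 3_000_000, 3_050_000)),
]
subject = [
    (("chr1", 100_000, 150_000), ("chr5", 1_050_000, 1_100_000)),
]
hits = overlaps_any(query, subject)
```

In the example, the first query bin pair is matched because both anchors lie within 100 kb of the subject anchors, while the second fails on its second anchor.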

Promoter capture Hi-C data processing
Promoter capture Hi-C sequencing data for human naive CD4+ T cells were obtained from EGA (https://www.ebi.ac.uk/ega) accession number EGAS00001001911. Read processing and alignment used the same methods as for the Hi-C data except that, as the restriction enzyme HindIII was used in the assay, the reads were split with a ligation signature of AAGCTAGCTT.
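The two ligation signatures quoted in the Methods follow from the enzymes' cut sites, as the Python sketch below illustrates. It assumes a palindromic recognition site that leaves a 5' overhang which is filled in before blunt-end ligation; the function names are hypothetical, and splitting the read at the midpoint of the signature is an assumption made for illustration, not a description of how the presplit_map.py script or cutadapt trims reads.

```python
def ligation_signature(site, cut):
    """Junction sequence formed by fill-in and blunt ligation.

    `site` is the recognition sequence on the top strand and `cut` is the
    top-strand cut offset from the start of the site (0 for MboI, which
    cuts before GATC; 1 for HindIII, which cuts A^AGCTT).
    """
    return site[: len(site) - cut] + site[cut:]


def split_at_signature(read, signature):
    """Split a read into 5' and 3' segments at the ligation junction.

    Returns (read, None) when the signature is absent. The cut is placed
    at the midpoint of the signature (an assumption of this sketch).
    """
    position = read.find(signature)
    if position == -1:
        return read, None
    cut = position + len(signature) // 2
    return read[:cut], read[cut:]


mboi = ligation_signature("GATC", 0)
hindiii = ligation_signature("AAGCTT", 1)
five_prime, three_prime = split_at_signature("AAAAGATCGATCTTTT", mboi)
```

The derived signatures match the sequences stated for the MboI and HindIII libraries, and each segment of the split toy read retains one copy of the GATC site.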

Visualization of results
Plaid plots were constructed using the plotPlaid function from the diffHic package. The range of colour intensities in each plot was scaled according to the library size of the sample, to facilitate comparisons between plots from different samples. Heatmaps of the loops between chromosomes were generated using the heatmap.2 function of the R package gplots. Circos plots were generated with the R package RCircos [44].