  • Matters Arising

Reply to: Revisiting the use of structural similarity index in Hi-C

The Original Article was published on 05 December 2023

Fig. 1: Differences in SSIM between the DLBCL–control comparison and the comparison of shuffled datasets highlight changes in genome organization.
Fig. 2: Data-based selection of SSIM and SN thresholds allows identification of windows with striking changes in genome organization.

Data availability

Data from Díaz et al. (ref. 4) are available from ArrayExpress under accession number E-MTAB-5875. Data from Rao et al. (ref. 13) are available from https://data.4dnucleome.org/ under experiment set accessions 4DNESI7DEJTM (K562) and 4DNES3JX38V5 (GM12878). Data from Wutz et al. (ref. 6) are available from https://data.4dnucleome.org/ under experiment set accessions 4DNES51Q5X3O, 4DNES7QY4JHS, 4DNESIKACYZC, 4DNESJ7ABWFM, 4DNESJAU6DPJ, 4DNESLZVKJ7V, 4DNESR381AXL, 4DNESR8I1SZG and 4DNESWO4PE7L.

Code availability

All code required to reproduce our analyses and visualization is available on GitHub as a Snakemake pipeline (https://github.com/vaquerizaslab/chess-2021) and is archived on Zenodo (https://doi.org/10.5281/zenodo.10041046).

References

  1. Galan, S. et al. CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction. Nat. Genet. https://doi.org/10.1038/s41588-020-00712-y (2020).

  2. Lee, H., Blumberg, B., Lawrence, M. S. & Shioda, T. Revisiting the use of structural similarity index in Hi-C. Nat. Genet. https://doi.org/10.1038/s41588-023-01594-6 (2023).

  3. Gunsalus, L. M. et al. Comparing chromatin contact maps at scale: methods and insights. Preprint at bioRxiv https://doi.org/10.1101/2023.04.04.535480 (2023).

  4. Díaz, N. et al. Chromatin conformation analysis of primary patient tissue using a low input Hi-C method. Nat. Commun. 9, 4938 (2018).

  5. Ing-Simmons, E. et al. Independence of chromatin conformation and gene regulation during Drosophila dorsoventral patterning. Nat. Genet. 53, 487–499 (2021).

  6. Wutz, G. et al. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 36, 3573–3599 (2017).

  7. Eagen, K. P. Principles of chromosome architecture revealed by Hi-C. Trends Biochem. Sci. 43, 469–478 (2018).

  8. Pal, K., Forcato, M. & Ferrari, F. Hi-C analysis: from data generation to integration. Biophys. Rev. 11, 67–78 (2019).

  9. Lun, A. T. L. & Smyth, G. K. diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics 16, 258 (2015).

  10. Cook, K. B. et al. Measuring significant changes in chromatin conformation with ACCOST. Nucleic Acids Res. 48, 2303–2311 (2020).

  11. Kruse, K., Hug, C. B. & Vaquerizas, J. M. FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data. Genome Biol. 21, 303 (2020).

  12. Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).

  13. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  14. Dekker, J. et al. The 4D nucleome project. Nature 549, 219–226 (2017).

  15. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).

  16. Conway, J. R., Lex, A. & Gehlenborg, N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33, 2938–2940 (2017).

  17. Lex, A., Gehlenborg, N., Strobelt, H., Vuillemot, R. & Pfister, H. UpSet: visualization of intersecting sets. IEEE Trans. Vis. Comput. Graph. 20, 1983–1992 (2014).

Acknowledgements

Work in the Vaquerizas laboratory is supported by the Medical Research Council, UK (award reference MC_UP_160510 to J.M.V.), the Academy of Medical Sciences and the Department of Business, Energy and Industrial Strategy (award reference APR31017 to J.M.V.). This work was also supported by the Deutsche Forschungsgemeinschaft (DFG) Priority Programme SPP2202: ‘Spatial Genome Architecture in Development and Disease’ (project number 422857230 to J.M.V.). Some of the data analyzed in this paper were derived from a HeLa cell line. Henrietta Lacks, and the HeLa cell line that was established from her tumor cells in 1951, has made significant contributions to scientific progress and advances in human health. We are grateful to Henrietta Lacks, now deceased, and to her surviving family members for their contributions to biomedical research. We thank K. Kruse and N. Diaz for critical reading of the manuscript.

Author information

Contributions

E.I.-S. carried out data analysis. E.I.-S., N.M. and J.M.V. conceptualized the study, interpreted results and wrote the manuscript.

Corresponding author

Correspondence to Juan M. Vaquerizas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Insulation score correlates between different datasets.

a, The insulation scores of control, DLBCL, and the shuffled datasets show strong pairwise correlations. b, Example locus on chromosome 2 showing high similarity between insulation scores of different Hi-C datasets in both a low insulation variance region (left, 35–36.5 Mb) and a structured region (right).

Extended Data Fig. 2 SSIM and SN correlate with genomic coverage and insulation score variance.

a, Pearson’s correlations of SSIM and SN profiles from the DLBCL–control comparison with genomic coverage of the Hi-C datasets and the variance of the insulation score in the 2 Mb window used for the CHESS analysis. b, Overall SSIM distributions derived from comparisons of GM12878 and K562 Hi-C data from Rao et al. 2014 (ref. 13). In all cases, GM12878 biological replicate 1, which has 1.8 billion valid pairs, was used as the reference dataset for the CHESS comparisons. Points and lines show mean ± one standard deviation across 592–603 2 Mb windows with valid SSIM values. c, Overall SN distributions for the same comparisons as in b. Points and lines show mean ± one standard deviation across 592–603 2 Mb windows with valid SSIM values.
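The summary statistics named in this legend can be illustrated with a minimal sketch. It assumes that windows without a valid SSIM value are stored as NaN and that the sample standard deviation is intended; neither detail is stated in the legend, and the function names `pearson` and `mean_sd` are hypothetical, not taken from the authors' pipeline.

```python
from math import isnan, sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson's r over windows where both profiles have a valid value."""
    # Assumption: invalid windows are encoded as NaN and are dropped pairwise.
    pairs = [(x, y) for x, y in zip(xs, ys) if not (isnan(x) or isnan(y))]
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x, _ in pairs)
    var_y = sum((y - my) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


def mean_sd(values):
    """Mean and sample standard deviation across windows with valid values."""
    valid = [v for v in values if not isnan(v)]
    return mean(valid), stdev(valid)
```

For example, `pearson(ssim_profile, coverage_profile)` would correlate a per-window SSIM profile with per-window coverage, where both argument names are placeholders for equal-length lists.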

Extended Data Fig. 3 Differences in SSIM between the DLBCL-control comparison and the comparison of mixed datasets highlight changes in genome organisation.

a, Subtraction of SSIM (mixed) from SSIM (DLBCL, control) highlights regions with changes between DLBCL and control, reproduced from Fig. 1c. b, Examples of regions where the SSIM difference profile indicates differences between DLBCL and control. Top, normalised Hi-C data at 25 kb resolution. Bottom, log2 observed / expected values for the same regions. Regions 2 and 3 are adjacent to the poorly-mappable centromeric region of chr2. CHESS performance in these regions may be improved by optimised filtering of low coverage Hi-C bins.

Extended Data Fig. 4 Global changes in SSIM due to global alterations of genome organization.

a, SSIM profiles for comparisons of Hi-C experiments from Wutz et al. 2017 (ref. 6). In each case, unmodified (‘WT’) HeLa cells were used as the reference for the comparison and the query sample is shown in the legend. All cells are synchronised in G1 phase unless otherwise stated. b, Normalised Hi-C data at 50 kb resolution for an example region on chr2, for the datasets from a. c, Overall SSIM distributions for the comparisons in a. Points and lines show mean ± one standard deviation across 301–305 4 Mb windows with valid SSIM values. d, Overall SN distributions for the comparisons in a. Points and lines show mean ± one standard deviation across 301–305 4 Mb windows with valid SSIM values. e, Pearson’s correlations of the SSIM profile for comparisons in a with genomic coverage of the Hi-C datasets used in the comparison.

Extended Data Fig. 5 Regions of interest identified from different biological replicates have significant overlap.

a, UpSet plot (refs. 16, 17) showing overlap between regions of interest identified from different pairwise comparisons between individual biological replicates of GM12878 and K562. Left, number of regions passing thresholds in each comparison. Bottom right, intersections between region sets are indicated by linked dots. Top right, intersection size is indicated by the height of the bar. b, The overlap between regions of interest from different comparisons is highly significant (two-sided Fisher’s exact test P value and estimated odds ratio).
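As an illustration of the overlap test in panel b, the sketch below computes a two-sided Fisher's exact test for a 2×2 table of windows (in both region sets, in only one, in neither). It is a plain standard-library implementation, not the authors' code. The odds ratio returned here is the simple sample odds ratio, which is an assumption; statistical packages may report a conditional maximum-likelihood estimate instead. The function name and table layout are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: windows in both region sets, b: only in the first set,
    c: only in the second set, d: in neither (layout is an assumption).
    Returns (sample odds ratio, two-sided P value).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as unlikely as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)
```

The table counts would come from intersecting two sets of window identifiers against the full set of windows tested in both comparisons.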

Extended Data Fig. 6 Relationship between SSIM, SSIM difference, and SN.

Scatterplots showing the relationship between SSIM and SSIM difference to a reference comparison for the DLBCL dataset (a) and the GM12878 and K562 datasets (b). Points are coloured such that regions below the appropriate SN threshold for the given dataset are coloured in shades of grey, while regions with SN values above the threshold are shown in shades of blue. The SN threshold used for the DLBCL dataset was calculated using the 90th percentile of SN values from the mixA-mixB comparison.
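The data-based thresholding described in this legend can be sketched as follows. The 90th-percentile SN threshold from a reference comparison is taken from the legend; the SSIM difference cutoff, its direction (more negative meaning a larger change) and all names in the code are assumptions made for illustration and are not values or identifiers from the original analysis.

```python
from statistics import quantiles


def percentile_90(values):
    """90th percentile with linear interpolation between data points."""
    # quantiles(..., n=10) returns the nine decile cut points; the last
    # one is the 90th percentile.
    return quantiles(values, n=10, method="inclusive")[-1]


def select_windows(ssim_test, ssim_ref, sn_test, sn_ref, ssim_diff_cutoff):
    """Return (SN threshold, indices of windows passing both filters).

    ssim_test / sn_test: per-window values from the comparison of interest.
    ssim_ref / sn_ref: per-window values from the reference comparison
    (for example, the comparison of mixed datasets).
    ssim_diff_cutoff: hypothetical cutoff on SSIM(test) - SSIM(reference).
    """
    sn_threshold = percentile_90(sn_ref)
    selected = []
    for i, (s_test, s_ref, sn) in enumerate(zip(ssim_test, ssim_ref, sn_test)):
        ssim_diff = s_test - s_ref
        # Assumption: a window is of interest when its SSIM drops relative to
        # the reference comparison and its SN is at or above the threshold.
        if ssim_diff <= ssim_diff_cutoff and sn >= sn_threshold:
            selected.append(i)
    return sn_threshold, selected
```

Deriving the SN threshold from the reference comparison ties it to the noise level of the data at hand rather than to a fixed number, which is the point of a data-based selection.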

About this article

Cite this article

Ing-Simmons, E., Machnik, N. & Vaquerizas, J.M. Reply to: Revisiting the use of structural similarity index in Hi-C. Nat Genet 55, 2053–2055 (2023). https://doi.org/10.1038/s41588-023-01595-5

  • DOI: https://doi.org/10.1038/s41588-023-01595-5
