Reply to: Revisiting the use of structural similarity index in Hi-C

Ing-Simmons, Elizabeth; Machnik, Nick; Vaquerizas, Juan M.

doi:10.1038/s41588-023-01595-5

Matters Arising
Published: 05 December 2023

Elizabeth Ing-Simmons¹,², Nick Machnik³ & Juan M. Vaquerizas¹,² (ORCID: 0000-0002-6583-6541)

Nature Genetics volume 55, pages 2053–2055 (2023)

The Original Article was published on 05 December 2023

replying to H. Lee et al. Nature Genetics https://doi.org/10.1038/s41588-023-01594-6 (2023)

We previously presented CHESS (Comparison of Hi-C Experiments using Structural Similarity), an approach that applies the concept of the structural similarity index (SSIM) to Hi-C matrices¹, and demonstrated that it could be used to identify both regions with similar three-dimensional (3D) chromatin conformation across species and regions with distinct chromatin conformation under different conditions. In contrast to the claim of Lee et al.² that the SSIM output of CHESS is ‘independent’ of the input data, here we confirm that SSIM depends on both local and global properties of the input Hi-C matrices. This agrees with our original benchmark of the method as well as with a recent independent systematic evaluation of 3D genome comparison methods³. We provide here two approaches for using CHESS to highlight regions of differential genome organization for further investigation and expanded guidelines for choosing appropriate parameters and controls for these analyses.
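For orientation, the SSIM of two equally sized matrices can be sketched with the standard single-window formula, which combines the means, variances and covariance of the two inputs. This is an illustrative sketch only and not the CHESS implementation: the function name `global_ssim`, the list-of-rows input format and the default constants `k1`, `k2` and `data_range` are assumptions made for this example.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized matrices (lists of rows).

    Hypothetical helper for illustration; CHESS itself is not reproduced here.
    """
    a = [v for row in x for v in row]
    b = [v for row in y for v in row]
    if not a or len(a) != len(b):
        raise ValueError("matrices must be non-empty and of equal size")

    # Brightness term inputs: the mean of each matrix.
    mu_a, mu_b = fmean(a), fmean(b)
    # Contrast and structure term inputs: variances and covariance.
    var_a = fmean((v - mu_a) ** 2 for v in a)
    var_b = fmean((v - mu_b) ** 2 for v in b)
    cov = fmean((p - mu_a) * (q - mu_b) for p, q in zip(a, b))

    # Small constants that stabilise the ratio when the denominators approach zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


# Toy 2 x 2 "contact matrices" (made-up values, not Hi-C data).
reference = [[1.0, 0.5], [0.5, 1.0]]
inverted = [[0.0, 0.5], [0.5, 0.0]]
same_score = global_ssim(reference, reference)
changed_score = global_ssim(reference, inverted)
```

In this toy case `same_score` is 1 up to floating-point rounding, while `changed_score` is lower, because every term of the formula is computed from the values of both input matrices.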

**Fig. 1: Differences in SSIM between the DLBCL–control comparison and the comparison of shuffled datasets highlight changes in genome organization.**

**Fig. 2: Data-based selection of SSIM and SN thresholds allows identification of windows with striking changes in genome organization.**

Data availability

Data from Diaz et al.⁴ are available from ArrayExpress under accession number E-MTAB-5875. Data from Rao et al.¹³ are available from https://data.4dnucleome.org/ under experiment set accessions 4DNESI7DEJTM (K562) and 4DNES3JX38V5 (GM12878). Data from Wutz et al.⁶ are available from https://data.4dnucleome.org/ under experiment set accessions 4DNES51Q5X3O, 4DNES7QY4JHS, 4DNESIKACYZC, 4DNESJ7ABWFM, 4DNESJAU6DPJ, 4DNESLZVKJ7V, 4DNESR381AXL, 4DNESR8I1SZG and 4DNESWO4PE7L.

Code availability

All code required to reproduce our analyses and visualization is available on GitHub as a Snakemake pipeline (https://github.com/vaquerizaslab/chess-2021) and is archived on Zenodo (https://doi.org/10.5281/zenodo.10041046).

References

1. Galan, S. et al. CHESS enables quantitative comparison of chromatin contact data and automatic feature extraction. Nat. Genet. https://doi.org/10.1038/s41588-020-00712-y (2020).
2. Lee, H., Blumberg, B., Lawrence, M. S. & Shioda, T. Revisiting the use of structural similarity index in Hi-C. Nat. Genet. https://doi.org/10.1038/s41588-023-01594-6 (2023).
3. Gunsalus, L. M. et al. Comparing chromatin contact maps at scale: methods and insights. Preprint at bioRxiv https://doi.org/10.1101/2023.04.04.535480 (2023).
4. Díaz, N. et al. Chromatin conformation analysis of primary patient tissue using a low input Hi-C method. Nat. Commun. 9, 4938 (2018).
5. Ing-Simmons, E. et al. Independence of chromatin conformation and gene regulation during Drosophila dorsoventral patterning. Nat. Genet. 53, 487–499 (2021).
6. Wutz, G. et al. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 36, 3573–3599 (2017).
7. Eagen, K. P. Principles of chromosome architecture revealed by Hi-C. Trends Biochem. Sci. 43, 469–478 (2018).
8. Pal, K., Forcato, M. & Ferrari, F. Hi-C analysis: from data generation to integration. Biophys. Rev. 11, 67–78 (2019).
9. Lun, A. T. L. & Smyth, G. K. diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics 16, 258 (2015).
10. Cook, K. B. et al. Measuring significant changes in chromatin conformation with ACCOST. Nucleic Acids Res. 48, 2303–2311 (2020).
11. Kruse, K., Hug, C. B. & Vaquerizas, J. M. FAN-C: a feature-rich framework for the analysis and visualisation of chromosome conformation capture data. Genome Biol. 21, 303 (2020).
12. Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).
13. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
14. Dekker, J. et al. The 4D nucleome project. Nature 549, 219–226 (2017).
15. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).
16. Conway, J. R., Lex, A. & Gehlenborg, N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33, 2938–2940 (2017).
17. Lex, A., Gehlenborg, N., Strobelt, H., Vuillemot, R. & Pfister, H. UpSet: visualization of intersecting sets. IEEE Trans. Vis. Comput. Graph. 20, 1983–1992 (2014).

Acknowledgements

Work in the Vaquerizas laboratory is supported by the Medical Research Council, UK (award reference MC_UP_160510 to J.M.V.), the Academy of Medical Sciences and the Department of Business, Energy and Industrial Strategy (award reference APR3\1017 to J.M.V.). This work was also supported by the Deutsche Forschungsgemeinschaft (DFG) Priority Programme SPP2202: ‘Spatial Genome Architecture in Development and Disease’ (project number 422857230 to J.M.V.). Some of the data analyzed in this paper were derived from a HeLa cell line. Henrietta Lacks, and the HeLa cell line that was established from her tumor cells in 1951, have made significant contributions to scientific progress and advances in human health. We are grateful to Henrietta Lacks, now deceased, and to her surviving family members for their contributions to biomedical research. We thank K. Kruse and N. Diaz for critical reading of the manuscript.

Author information

Authors and Affiliations

MRC London Institute of Medical Sciences, London, UK
Elizabeth Ing-Simmons & Juan M. Vaquerizas
Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
Elizabeth Ing-Simmons & Juan M. Vaquerizas
Institute of Science and Technology Austria, Klosterneuburg, Austria
Nick Machnik

Contributions

E.I.-S. carried out data analysis. E.I.-S., N.M. and J.M.V. conceptualized the study, interpreted results and wrote the manuscript.

Corresponding author

Correspondence to Juan M. Vaquerizas.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Insulation score correlates between different datasets.

a, The insulation scores of control, DLBCL, and the shuffled datasets show strong pairwise correlations. b, Example locus on chromosome 2 showing high similarity between insulation scores of different Hi-C datasets in both a low insulation variance region (left, 35–36.5 Mb) and a structured region (right).

Extended Data Fig. 2 SSIM and SN correlate with genomic coverage and insulation score variance.

a, Pearson’s correlations of SSIM and SN profiles from the DLBCL–control comparison with genomic coverage of the Hi-C datasets and the variance of the insulation score in the 2 Mb window used for the CHESS analysis. b, Overall SSIM distributions derived from comparisons of GM12878 and K562 Hi-C data from Rao et al. 2014 (ref. 13). In all cases GM12878 biological replicate 1 was used as the reference dataset for the CHESS comparisons. This sample has 1.8 billion valid pairs. Points and lines show mean ± one standard deviation across 592–603 2 Mb windows with valid SSIM values. c, Overall SN distributions for the same comparisons as in b. Points and lines show mean ± one standard deviation across 592–603 2 Mb windows with valid SSIM values.

Extended Data Fig. 3 Differences in SSIM between the DLBCL-control comparison and the comparison of mixed datasets highlight changes in genome organisation.

a, Subtraction of SSIM (mixed) from SSIM (DLBCL, control) highlights regions with changes between DLBCL and control, reproduced from Fig. 1c. b, Examples of regions where the SSIM difference profile indicates differences between DLBCL and control. Top, normalised Hi-C data at 25 kb resolution. Bottom, log2 observed / expected values for the same regions. Regions 2 and 3 are adjacent to the poorly-mappable centromeric region of chr2. CHESS performance in these regions may be improved by optimised filtering of low coverage Hi-C bins.
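The subtraction described in panel a can be sketched as a per-window difference between two SSIM profiles. This is a minimal sketch under stated assumptions: the function name `ssim_difference`, the dictionary-per-window layout, the use of `None` for windows without a valid SSIM value and the toy numbers are all hypothetical and are not taken from the published pipeline.

```python
def ssim_difference(comparison, reference):
    """Return SSIM(comparison) - SSIM(reference) for each shared window.

    Windows without a valid SSIM value (None or missing) in either
    profile are skipped rather than filled in.
    """
    difference = {}
    for window, value in comparison.items():
        ref_value = reference.get(window)
        if value is None or ref_value is None:
            continue
        difference[window] = value - ref_value
    return difference


# Made-up SSIM profiles keyed by a window label.
dlbcl_vs_control = {"chr2:0-2Mb": 0.90, "chr2:2-4Mb": 0.40, "chr2:4-6Mb": None}
mixed_vs_mixed = {"chr2:0-2Mb": 0.92, "chr2:2-4Mb": 0.85, "chr2:4-6Mb": 0.80}

profile = ssim_difference(dlbcl_vs_control, mixed_vs_mixed)
# The most negative difference marks the window that is least similar
# relative to the reference comparison.
most_changed = min(profile, key=profile.get)
```

Subtracting a reference comparison in this way removes the component of the SSIM profile that is shared by both comparisons, so that windows with strongly negative differences stand out for follow-up.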

Extended Data Fig. 4 Global changes in SSIM due to global alterations of genome organization.

a, SSIM profiles for comparisons of Hi-C experiments from Wutz et al. 2017 (ref. 6). In each case unmodified (‘WT’) HeLa cells were used as the reference for the comparison and the query sample is shown in the legend. All cells are synchronised in G1 phase unless otherwise stated. b, Normalised Hi-C data at 50 kb resolution for an example region on chr2, for the datasets from a. c, Overall SSIM distributions for the comparisons in a. Points and lines show mean ± one standard deviation across 301–305 4 Mb windows with valid SSIM values. d, Overall SN distributions for the comparisons in a. Points and lines show mean ± one standard deviation across 301–305 4 Mb windows with valid SSIM values. e, Pearson’s correlations of the SSIM profile for comparisons in a with genomic coverage of the Hi-C datasets used in the comparison.

Extended Data Fig. 5 Regions of interest identified from different biological replicates have significant overlap.

a, UpSet plot¹⁶,¹⁷ showing overlap between regions of interest identified from different pairwise comparisons between individual biological replicates of GM12878 and K562. Left, number of regions passing thresholds in each comparison. Bottom right, intersections between region sets are indicated by linked dots. Top right, intersection size is indicated by the height of the bar. b, The overlap between regions of interest from different comparisons is highly significant (two-sided Fisher’s exact test p-value and estimated odds ratio).
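The overlap test in panel b can be illustrated with a small two-sided Fisher’s exact test on a 2 × 2 table built from two sets of windows. This is a sketch under stated assumptions, not the code used for the figure: the names `overlap_table` and `fisher_exact_two_sided`, the toy window sets and the choice of all analysed windows as the background are hypothetical. The odds ratio computed here is the simple sample odds ratio, which can differ from the conditional estimate reported by some statistics packages.

```python
from math import comb


def overlap_table(set_a, set_b, universe):
    """2 x 2 counts: in both, only in A, only in B, in neither."""
    both = len(set_a & set_b)
    only_a = len(set_a - set_b)
    only_b = len(set_b - set_a)
    neither = len(universe - set_a - set_b)
    return both, only_a, only_b, neither


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalised hypergeometric probability of seeing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Toy example: 8 windows in total, two region sets sharing 3 windows.
all_windows = set(range(1, 9))
regions_1 = {1, 2, 3, 4}
regions_2 = {2, 3, 4, 5}

a, b, c, d = overlap_table(regions_1, regions_2, all_windows)
p_value = fisher_exact_two_sided(a, b, c, d)
# Sample odds ratio; undefined when b * c == 0.
odds_ratio = (a * d) / (b * c) if b * c else float("inf")
```

Working with integer weights avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.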

Extended Data Fig. 6 Relationship between SSIM, SSIM difference, and SN.

Scatterplots showing the relationship between SSIM and SSIM difference to a reference comparison for the DLBCL dataset (a) and the GM12878 and K562 datasets (b). Points are coloured such that regions below the appropriate SN threshold for the given dataset are coloured in shades of grey, while regions with SN values above the threshold are shown in shades of blue. The SN threshold used for the DLBCL dataset was calculated using the 90th percentile of SN values from the mixA-mixB comparison.
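The data-based SN threshold described above can be sketched as a percentile of the SN values from a reference comparison, followed by a filter on the comparison of interest. This is an illustrative sketch only: the function names, the linear-interpolation percentile, the dictionary-per-window layout and the toy values are assumptions, and only the 90th percentile of the reference SN values comes from the legend.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    data = sorted(values)
    if not data:
        raise ValueError("no values given")
    position = (len(data) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (position - lower)


def windows_above_sn_threshold(query_sn, reference_sn, q=90):
    """Derive the SN threshold from the reference comparison and return it
    together with the query windows whose SN exceeds it."""
    threshold = percentile(reference_sn.values(), q)
    selected = sorted(w for w, sn in query_sn.items() if sn > threshold)
    return threshold, selected


# Made-up SN values: a reference (for example mixA-mixB) and a query comparison.
reference_sn = {i: float(i) for i in range(1, 12)}
query_sn = {"w1": 9.5, "w2": 10.5, "w3": 12.0}

threshold, selected = windows_above_sn_threshold(query_sn, reference_sn)
```

Taking the threshold from a comparison in which no biological difference is expected ties the cutoff to the noise level of the dataset itself rather than to a fixed value.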

Supplementary information

Supplementary Information

Supplementary Note.

Reporting Summary

About this article

Cite this article

Ing-Simmons, E., Machnik, N. & Vaquerizas, J.M. Reply to: Revisiting the use of structural similarity index in Hi-C. Nat Genet 55, 2053–2055 (2023). https://doi.org/10.1038/s41588-023-01595-5

Received: 15 October 2021
Accepted: 25 October 2023
Published: 05 December 2023
Issue Date: December 2023
DOI: https://doi.org/10.1038/s41588-023-01595-5
