  • Protocol

Streamlining spatial omics data analysis with Pysodb

Abstract

Advances in spatial omics technologies have improved the understanding of cellular organization in tissues, leading to the generation of complex and heterogeneous data and prompting the development of specialized tools for managing, loading and visualizing spatial omics data. The Spatial Omics Database (SODB) was established to offer a unified format for data storage and interactive visualization modules. Here we detail the use of Pysodb, a Python-based tool designed to enable the efficient exploration and loading of spatial datasets from SODB within a Python environment. We present seven case studies using Pysodb, detailing the interaction with various computational methods, ensuring reproducibility of experimental data and facilitating the integration of new data and alternative applications in SODB. The approach offers a reference for method developers by outlining label and metadata availability in representative spatial data that can be loaded by Pysodb. The tool is supplemented by a website (https://protocols-pysodb.readthedocs.io/) with detailed information for benchmarking analysis, and allows method developers to focus on computational models by facilitating data processing. This protocol is designed for researchers with limited experience in computational biology. Depending on the dataset complexity, the protocol typically requires ~12 h to complete.

Key points

  • Pysodb allows researchers to load and explore spatial omics data in a Python environment. Data loaded using Pysodb follow the AnnData format, thus providing a unified format for storing over 3,000 datasets and facilitating benchmarking and reuse of data.

  • Alternative packages such as Scanpy, Squidpy and Giotto focus on data analysis; Pysodb complements them by providing a support platform for data storage and handling.
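The loading workflow described in the key points can be sketched as follows. This is a minimal illustration, not the authors' code: the method names `list_experiment_by_dataset` and `load_experiment` are assumed from the Pysodb documentation and should be checked against the installed version, the helpers `load_first_experiment` and `describe` are hypothetical, and the `"spatial"` key in `obsm` is an assumed convention for spatial coordinates.

```python
from typing import Any, Protocol


class SodbLike(Protocol):
    """Interface assumed of a pysodb.SODB() client (hypothetical typing helper)."""

    def list_experiment_by_dataset(self, dataset_name: str) -> list: ...

    def load_experiment(self, dataset_name: str, experiment_name: str) -> Any: ...


def load_first_experiment(client: SodbLike, dataset_name: str):
    """Return (experiment_name, adata) for the first experiment of a dataset."""
    experiments = list(client.list_experiment_by_dataset(dataset_name))
    if not experiments:
        raise ValueError(f"no experiments found for dataset {dataset_name!r}")
    name = experiments[0]
    return name, client.load_experiment(dataset_name, name)


def describe(adata: Any) -> dict:
    """Summarize the AnnData fields that downstream tools typically read."""
    return {
        "n_obs": adata.n_obs,                      # number of cells or spots
        "n_vars": adata.n_vars,                    # number of measured features
        "obs_columns": list(adata.obs.columns),    # available labels and metadata
        "has_spatial": "spatial" in adata.obsm,    # assumed key for coordinates
    }


# Usage with Pysodb installed (requires network access, so not executed here):
#   import pysodb
#   sodb = pysodb.SODB()
#   name, adata = load_first_experiment(sodb, "<dataset name from SODB>")
#   print(name, describe(adata))
```

Passing the client in as an argument keeps the helpers independent of the network, so the same code can be exercised with a stub object before it is pointed at SODB.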

Fig. 1: Overview of the protocol.
Fig. 2: Pysodb can interact with spatially variable gene detection.
Fig. 3: Pysodb can interact with spatial clustering analysis.
Fig. 4: Pysodb can interact with pseudo-spatiotemporal analysis.
Fig. 5: Pysodb can interact with spatial data integration.
Fig. 6: Pysodb can interact with spatial data alignment.
Fig. 7: Pysodb can interact with spatial spot deconvolution.
Fig. 8: Detailed information about representative datasets available for Pysodb.

Data availability

The spatial datasets discussed in this protocol are available from SODB (https://gene.ai.tencent.com/SpatialOmics/). We provide guidelines on how to load and visualize these data at https://protocols-pysodb.readthedocs.io/en/latest/SOView/SOView.html#. The mouse cortex single-cell data are provided at https://figshare.com/articles/dataset/Visium/22332667. The human PDAC single-cell data are provided at https://figshare.com/articles/dataset/PDAC/22332574.

Code availability

Pysodb is a freely available software package written in the Python programming language. Source code can be found at https://github.com/TencentAILabHealthcare/pysodb. Installation instructions can be found at https://pysodb.readthedocs.io/en/latest/. The code used in this paper can be found at https://protocols-sodb.readthedocs.io/en/latest/. A Python version of SOView code can be found at https://github.com/yuanzhiyuan/SOView. An SOView tutorial can be found at https://soview-doc.readthedocs.io/en/latest/index.html.

Acknowledgements

This work was supported by the Chenguang Program of Shanghai Education Development Foundation and Shanghai Municipal Education Commission; Shanghai Science and Technology Development Funds (23YF1403000); Tencent AI Lab Rhino-Bird Focused Research Program (RBFR2023008); Shanghai Municipal Science and Technology Major Project (no. 2018SHZDZX01); ZJ Lab; Shanghai Center for Brain Science and Brain-Inspired Technology; 111 Project (no. B18015); the Innovation Fund of Institute of Computing and Technology, CAS (E161080, E161030); and Beijing Natural Science Foundation Haidian Origination and Innovation Joint Fund (L222007).

Author information

Contributions

Y.Z. and Z.Y. conceived and designed the study. Z.Y. designed the pipeline and collected the methods and datasets. S.L. and Z.Y. completed the pipeline. Z.Y. and S.L. analyzed the results and generated the figures. J.Y. and Z.W. maintained the database. Z.Y., S.L., Z.F. and Y.Z. wrote the manuscript.

Corresponding authors

Correspondence to Yi Zhao or Zhiyuan Yuan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Protocols thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key reference using this protocol

Yuan, Z. et al. Nat. Methods 20, 387–399 (2023): https://doi.org/10.1038/s41592-023-01773-7

Supplementary information

Supplementary Information

Supplementary Figs. 1–4, Supplementary Table 2 and Supplementary Protocols.

Reporting Summary

Supplementary Table 1

Summary of parameters.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, S., Zhao, F., Wu, Z. et al. Streamlining spatial omics data analysis with Pysodb. Nat Protoc 19, 831–895 (2024). https://doi.org/10.1038/s41596-023-00925-5

  • DOI: https://doi.org/10.1038/s41596-023-00925-5
