Marine Genomics

Volume 26, April 2016, Pages 21-28
Method paper
Pan-transcriptomic analysis identifies coordinated and orthologous functional modules in the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum

https://doi.org/10.1016/j.margen.2015.10.011

Abstract

Diatoms are important primary producers in the ocean that thrive in diverse and dynamic environments. Their survival and success over changing conditions depend on the complex coordination of gene regulatory processes. Here we present an integrated analysis of all publicly available microarray data for the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum. This resource includes shared expression patterns, gene functions, and cis-regulatory DNA sequence motifs in each species that are statistically coordinated over many experiments. These data illustrate the coordination of transcriptional responses in diatoms over changing environmental conditions. Responses to silicic acid depletion segregate into multiple distinctly regulated groups of genes, regulation by heat shock transcription factors (HSFs) is implicated in the response to nitrate stress, and distinctly coordinated carbon concentrating, CO2 and pH-related responses are apparent. Fundamental features of diatom physiology are similarly coordinated between two distantly related diatom species, including the regulation of photosynthesis, cellular growth functions and lipid metabolism. These integrated data and analyses can be explored publicly (http://networks.systemsbiology.net/diatom-portal/).

Introduction

Diatoms are important primary producers in marine ecosystems (Armbrust, 2009, Tsuda et al., 2003), and their ability and capacity to physiologically adjust to changing ocean conditions are critical to their ecological success, both now and in the future. The heterokonts (including diatoms) are phylogenetically distinct from other clades, and represent relatively undiscovered genomic and physiological territory. Recent studies of diatom genomes (Armbrust et al., 2004, Bowler et al., 2008), transcriptomes and proteomes (Allen et al., 2008, Ashworth et al., 2013, Brembu et al., 2011, Carvalho et al., 2011, Chauton et al., 2013, Hook and Osborn, 2012, Kustka et al., 2014, Levitan et al., 2015, Mock et al., 2008, Nymark et al., 2009, Nymark et al., 2013, Sapriel et al., 2009, Shrestha et al., 2012, Thamatrakoln et al., 2012, Thamatrakoln et al., 2013, Valle et al., 2014) hint at complex molecular and regulatory programs that control diatom physiology and acclimation to a variety of different environmental conditions. However, the specific coordination and regulation of these molecular and adaptive processes in diatoms remain largely unknown for multiple reasons: i) the size and complexity of diatom genomes and their molecular responses to change, ii) their genetic uniqueness and lack of sufficient homology in comparison to other well-studied clades, iii) low-throughput experimental genetic approaches, and iv) insufficient data and analysis methods to identify concurrent yet distinct molecular and regulatory pathways operating simultaneously under various conditions.

The integration and analysis of data collected through many different transcriptomic experiments can be used to discover fundamentally coordinated, conditional, and distinct molecular responses that are reflective of gene regulatory processes and cannot be discovered through individual experiments (Brooks et al., 2014, Danziger et al., 2014, Reiss et al., 2006). Data-driven approaches for the discovery of molecular coordination may be particularly useful in the case of diatoms, given the size and novelty of their genomes, transcriptomes and proteomes. To systematically discover and identify modules of putatively coordinated and functionally related diatom genes, we aggregated all available microarray expression data for the model diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum, and performed hierarchical clustering (Eisen et al., 1998) and motif-guided biclustering (Reiss et al., 2006) over many experimental conditions. Highlights of this analysis in the context of specific conditions and functions are discussed herein, and the complete results have been made available for further use and exploration in a web portal at the following URL: http://networks.systemsbiology.net/diatom-portal/.
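As a rough illustration of the hierarchical clustering step mentioned above, the following minimal sketch groups genes whose expression profiles are correlated across conditions. It is not the authors' pipeline: the gene names, expression values, the choice of Pearson correlation distance, average linkage, and the `max_distance` cutoff are all assumptions made for this example, and the motif-guided biclustering step is not covered.

```python
from itertools import combinations
from statistics import mean, pstdev


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 1.0  # assumption: a flat profile is treated as uncorrelated
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return 1.0 - cov / (sx * sy)


def average_linkage_clusters(profiles, max_distance):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two closest clusters and stops once the
    closest remaining pair is farther apart than max_distance.
    """
    genes = list(profiles)
    # Pairwise distances between all gene profiles, computed once.
    dist = {
        frozenset(pair): pearson_distance(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(genes, 2)
    }
    clusters = [[g] for g in genes]

    def linkage(a, b):
        # Average distance over all cross-cluster gene pairs.
        return mean(dist[frozenset((g, h))] for g in a for h in b)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > max_distance:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Hypothetical expression values for four genes over four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.9, 3.2, 1.8, 1.1],
}
clusters = average_linkage_clusters(profiles, max_distance=0.5)
print(sorted(clusters))
```

In this toy input, geneA and geneB rise together while geneC and geneD fall together, so the two pairs end up in separate clusters. With real data, profiles would span many experiments, which is what allows responses that look identical in one experiment to separate into distinct groups.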

Integrated diatom transcriptomic dataset

Transcriptome-wide microarray expression data for T. pseudonana used in this analysis included: silica, iron, and nitrogen limitation, low temperature and elevated pH (Mock et al., 2008), exposure to pollutant and mutagen benzo[a]pyrene (Carvalho et al., 2011), iron starvation (Thamatrakoln et al., 2012), silica re-supplementation (Shrestha et al., 2012), diel growth from exponential to stationary phase (Ashworth et al., 2013), and growth at moderate and elevated CO2 levels under moderate and

Pan-transcriptomic clustering identifies separately coordinated groups of genes within singular condition-specific responses

Through integrative, whole-genome multi-experiment clustering (Fig. 1), numerous distinctly and significantly co-expressed sets of functionally-related transcripts were discovered in T. pseudonana and P. tricornutum (Table 1, Table S1). In addition, lists of transcripts that were significantly affected in individual previous experiments were segregated into separate and distinctly coordinated groups of genes. For example, transcripts whose expression increased in silica-limited T. pseudonana

Conclusion

In this work, an integrated analysis of transcriptomic data for two model diatom species (T. pseudonana and P. tricornutum) over many independent experiments resulted in the comprehensive inference of coordinated and conditional transcriptional responses that are indicative of distinct functional and gene regulatory processes operating robustly in the cell. The clustering and analysis of combined datasets performed here allowed the partitioning of the large transcriptomes of these diatom

Acknowledgments

This work was supported by the National Science Foundation (OCB-0928561 and MCB-1316206 to M.V.O. and N.S.B.).

References (40)

  • S.F. Altschul et al. Basic local alignment search tool. J. Mol. Biol. (1990)
  • S.E. Hook et al. Comparison of toxicity and transcriptomic profiles in a diatom exposed to oil, dispersants, dispersed oil. Aquat. Toxicol. Amst. Neth. (2012)
  • H. Pelham. Activation of heat-shock genes in eukaryotes. Trends Genet. (1985)
  • A.E. Allen et al. Whole-cell response of the pennate diatom Phaeodactylum tricornutum to iron starvation. Proc. Natl. Acad. Sci. U. S. A. (2008)
  • E.V. Armbrust. The life of diatoms in the world's oceans. Nature (2009)
  • E.V. Armbrust et al. The genome of the diatom Thalassiosira pseudonana: ecology, evolution, and metabolism. Science (2004)
  • J. Ashworth et al. Genome-wide diel growth state transitions in the diatom Thalassiosira pseudonana. Proc. Natl. Acad. Sci. U. S. A. (2013)
  • T.L. Bailey et al. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. (1994)
  • T.L. Bailey et al. MEME suite: tools for motif discovery and searching. Nucleic Acids Res. (2009)
  • Y. Benjamini et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. (1995)
  • C. Bowler et al. The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature (2008)
  • T. Brembu et al. Genome-wide profiling of responses to cadmium in the diatom Phaeodactylum tricornutum. Environ. Sci. Technol. (2011)
  • A.N. Brooks et al. A system-level model for the microbial regulatory genome. Mol. Syst. Biol. (2014)
  • R.N. Carvalho et al. Transcriptomics responses in marine diatom Thalassiosira pseudonana exposed to the polycyclic aromatic hydrocarbon benzo[a]pyrene. PLoS One (2011)
  • M.S. Chauton et al. Gene regulation of carbon fixation, storage, and utilization in the diatom Phaeodactylum tricornutum acclimated to light/dark cycles. Plant Physiol. (2013)
  • S.A. Danziger et al. Molecular mechanisms of system responses to novel stimuli are predictable from public data. Nucleic Acids Res. (2014)
  • M.B. Eisen et al. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. (1998)
  • J.M. Franco-Zorrilla et al. DNA-binding specificities of plant transcription factors and their potential to define target genes. Proc. Natl. Acad. Sci. (2014)
  • I.V. Grigoriev et al. The Genome Portal of the Department of Energy Joint Genome Institute. Nucleic Acids Res. (2011)
  • S. Gupta et al. Quantifying similarity between motifs. Genome Biol. (2007)