Method paper
Pan-transcriptomic analysis identifies coordinated and orthologous functional modules in the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum
Introduction
Diatoms are important primary producers in marine ecosystems (Armbrust, 2009, Tsuda et al., 2003), and their ability and capacity to physiologically adjust to changing ocean conditions are critical to their ecological success, both now and in the future. The heterokonts (including diatoms) are phylogenetically distinct from other clades, and represent relatively undiscovered genomic and physiological territory. Recent studies of diatom genomes (Armbrust et al., 2004, Bowler et al., 2008), transcriptomes and proteomes (Allen et al., 2008, Ashworth et al., 2013, Brembu et al., 2011, Carvalho et al., 2011, Chauton et al., 2013, Hook and Osborn, 2012, Kustka et al., 2014, Levitan et al., 2015, Mock et al., 2008, Nymark et al., 2009, Nymark et al., 2013, Sapriel et al., 2009, Shrestha et al., 2012, Thamatrakoln et al., 2012, Thamatrakoln et al., 2013, Valle et al., 2014) hint at complex molecular and regulatory programs that control diatom physiology and acclimation to a variety of different environmental conditions. However, the specific coordination and regulation of these molecular and adaptive processes in diatoms remain largely unknown for multiple reasons: i) the size and complexity of diatom genomes and their molecular responses to change, ii) their genetic uniqueness and lack of sufficient homology in comparison to other well-studied clades, iii) low-throughput experimental genetic approaches, and iv) insufficient data and analysis methods to identify concurrent yet distinct molecular and regulatory pathways operating simultaneously under various conditions.
The integration and analysis of data collected through many different transcriptomic experiments can be used to discover fundamentally coordinated, conditional, and distinct molecular responses that are reflective of gene regulatory processes and cannot be discovered through individual experiments (Brooks et al., 2014, Danziger et al., 2014, Reiss et al., 2006). Data-driven approaches for the discovery of molecular coordination may be particularly useful in the case of diatoms, given the size and novelty of their genomes, transcriptomes and proteomes. To systematically discover and identify modules of putatively coordinated and functionally related diatom genes, we aggregated all available microarray expression data for the model diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum, and performed hierarchical clustering (Eisen et al., 1998) and motif-guided biclustering (Reiss et al., 2006) over many experimental conditions. Highlights of this analysis in the context of specific conditions and functions are discussed herein, and the complete results are available for further use and exploration in a web portal at http://networks.systemsbiology.net/diatom-portal/.
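As a rough illustration of the hierarchical clustering step mentioned above, the following minimal sketch groups genes by the similarity of their expression profiles across conditions. It assumes a correlation-based distance (1 minus Pearson correlation) and average linkage, which is a common choice for expression data; the settings actually used in this study are not stated in this excerpt. The gene names, values, and the helper `average_linkage` are hypothetical and exist only for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Distance between genes is 1 - Pearson correlation; distance between
    clusters is the mean of all pairwise gene distances (average linkage).
    """
    genes = list(profiles)
    dist = {
        frozenset((g, h)): 1.0 - pearson(profiles[g], profiles[h])
        for g, h in combinations(genes, 2)
    }
    clusters = [[g] for g in genes]

    def cluster_distance(a, b):
        return sum(dist[frozenset((g, h))] for g in a for h in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy log-ratio profiles across five hypothetical conditions (made-up values).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.3, 2.9, 4.2, 4.8],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [4.7, 4.1, 3.2, 1.9, 1.2],
}

modules = average_linkage(toy_profiles, n_clusters=2)
print(modules)
```

In this toy case the two genes that rise across conditions fall into one group and the two that decline fall into the other. A real analysis over thousands of transcripts would normally rely on an optimized library routine instead of this quadratic pure-Python loop, and the sketch does not handle profiles with zero variance.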
Integrated diatom transcriptomic dataset
Transcriptome-wide microarray expression data for T. pseudonana used in this analysis included: silica, iron, and nitrogen limitation, low temperature and elevated pH (Mock et al., 2008), exposure to the pollutant and mutagen benzo[a]pyrene (Carvalho et al., 2011), iron starvation (Thamatrakoln et al., 2012), silica re-supplementation (Shrestha et al., 2012), diel growth from exponential to stationary phase (Ashworth et al., 2013), and growth at moderate and elevated CO2 levels under moderate and
Pan-transcriptomic clustering identifies separately coordinated groups of genes within singular condition-specific responses
Through integrative, whole-genome multi-experiment clustering (Fig. 1), numerous distinctly and significantly co-expressed sets of functionally-related transcripts were discovered in T. pseudonana and P. tricornutum (Table 1, Table S1). In addition, lists of transcripts that were significantly affected in individual previous experiments were segregated into separate and distinctly coordinated groups of genes. For example, transcripts whose expression increased in silica-limited T. pseudonana
Conclusion
In this work, an integrated analysis of transcriptomic data for two model diatom species (T. pseudonana and P. tricornutum) over many independent experiments resulted in the comprehensive inference of coordinated and conditional transcriptional responses that are indicative of distinct functional and gene regulatory processes operating robustly in the cell. The clustering and analysis of combined datasets performed here allowed the partitioning of the large transcriptomes of these diatom
Acknowledgments
This work was supported by the National Science Foundation (OCB-0928561 and MCB-1316206 to M.V.O. and N.S.B.).
References (40)
- et al. Basic local alignment search tool. J. Mol. Biol. (1990)
- et al. Comparison of toxicity and transcriptomic profiles in a diatom exposed to oil, dispersants, dispersed oil. Aquat. Toxicol. Amst. Neth. (2012)
- Activation of heat-shock genes in eukaryotes. Trends Genet. (1985)
- et al. Whole-cell response of the pennate diatom Phaeodactylum tricornutum to iron starvation. Proc. Natl. Acad. Sci. U. S. A. (2008)
- The life of diatoms in the world's oceans. Nature (2009)
- et al. The genome of the diatom Thalassiosira pseudonana: ecology, evolution, and metabolism. Science (2004)
- et al. Genome-wide diel growth state transitions in the diatom Thalassiosira pseudonana. Proc. Natl. Acad. Sci. U. S. A. (2013)
- et al. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. (1994)
- et al. MEME suite: tools for motif discovery and searching. Nucleic Acids Res. (2009)
- et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. (1995)
- The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature
- Genome-wide profiling of responses to cadmium in the diatom Phaeodactylum tricornutum. Environ. Sci. Technol.
- A system-level model for the microbial regulatory genome. Mol. Syst. Biol.
- Transcriptomics responses in marine diatom Thalassiosira pseudonana exposed to the polycyclic aromatic hydrocarbon benzo[a]pyrene. PLoS One
- Gene regulation of carbon fixation, storage, and utilization in the diatom Phaeodactylum tricornutum acclimated to light/dark cycles. Plant Physiol.
- Molecular mechanisms of system responses to novel stimuli are predictable from public data. Nucleic Acids Res.
- Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci.
- DNA-binding specificities of plant transcription factors and their potential to define target genes. Proc. Natl. Acad. Sci.
- The Genome Portal of the Department of Energy Joint Genome Institute. Nucleic Acids Res.
- Quantifying similarity between motifs. Genome Biol.
Cited by (21)
Light-dependent metabolic shifts in the model diatom Thalassiosira pseudonana
2023, Algal Research
The role of antioxidant enzymes in diatoms and their therapeutic role
2022, Marine Antioxidants: Preparations, Syntheses, and Applications
New paradigm in diatom omics and genetic manipulation
2021, Bioresource Technology
Citation Excerpt: Probably 2–6% of diatom genes have been reported to be of bacterial origin acquired via horizontal gene transfer (Basu et al., 2017). Differential gene expression in diatoms is controlled either by transcription factors or regulatory elements as has been demonstrated by computational analysis of transcriptome data or by high throughput single-cell transcriptomics to quantitatively analyze cellular and physiological responses and understand interspecific functional interactions (Ashworth et al., 2016; Banerjee et al., 2016; Ku and Sebé-Pedrós, 2019). Additionally, numerous epigenetic processes like methylation of DNA and structural changes in chromatin highlighting adaptive radiations have been pivotal for macro evolutionary research on diatoms (Benoiston et al., 2017).
Exploring ‘omics’ approaches: Towards understanding the essence of stress phenomena in diatoms and haptophytes
2020, Handbook of Algal Science, Technology and Medicine
A new mechanistic understanding of light-limitation in the seagrass Zostera muelleri
2018, Marine Environmental Research
Citation Excerpt: The filtered leaf-specific transcriptome was used as the ‘background’ dataset. To identify subsets of genes which were differently expressed to light-limitation but co-expressed in a highly similar nature, ten-thousand-fold bootstrapped hierarchical clustering (Ashworth et al., 2016; Suzuki and Shimodaira, 2006) was performed. First, four hundred clusters were created by height-based tree cutting, a common practice to cluster which heuristically pre-supposes the expected number of clusters.