There is a newer version of the record available.

Published May 27, 2020 | Version 0.6.0
Dataset Open

Field-wide assessment of differential HT-seq from NCBI GEO database

  • 1. University of Tartu

Description

We analyzed the field of expression profiling by high-throughput sequencing (HT-seq) in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository. Based on the types of files submitted to GEO, our work places an upper bound of 62% on field-wide reproducibility.

The archived dataset contains the following files:

- output/parsed_suppfiles.csv, p-value histograms, histogram classes, and the estimated proportion of true null hypotheses (pi0)

- output/document_summaries.csv, document summaries of NCBI GEO series

- output/publications.csv, publication info of NCBI GEO series

- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

- output/single-cell.csv, single-cell experiments

- spots.csv, NCBI SRA sequencing run metadata

- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.

- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
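Because both filename lists hold one filename per row, they can be read with plain line-based parsing. The following is a minimal sketch, assuming the archive has been extracted locally; the helper names `read_filenames` and `count_extensions` are illustrative and not part of the dataset.

```python
from collections import Counter
from pathlib import Path


def read_filenames(path):
    """Return the non-empty lines of a one-filename-per-row text file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def count_extensions(filenames):
    """Count filenames by their final suffix (e.g. '.gz'), lowercased."""
    return Counter(Path(name).suffix.lower() for name in filenames)
```

For example, `count_extensions(read_filenames("suppfilenames.txt")).most_common(5)` would list the five most frequent file suffixes, assuming `suppfilenames.txt` sits in the working directory.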

Files (44.2 MB)

md5:acd145c3e55344e4867eded7460a7cad
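The MD5 checksum listed for the archive can be used to confirm that a downloaded copy is intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names are illustrative, and the local path of the downloaded file is an assumption that depends on where it was saved.

```python
import hashlib

# Checksum as listed on the record page.
EXPECTED_MD5 = "acd145c3e55344e4867eded7460a7cad"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected` with the path of the downloaded archive returns `True` only when the file's digest equals the listed checksum.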