Springer Nature
12859_2022_4927_MOESM2_ESM.xlsx (9.94 kB)

Additional file 2 of Genomic data integration and user-defined sample-set extraction for population variant analysis

dataset
posted on 2022-09-30, 06:15 authored by Tommaso Alfonsi, Anna Bernasconi, Arif Canakoglu, Marco Masseroli
Additional file 2. Example of transformed metadata. This .xlsx (MS Excel) file lists all the output metadata categories generated for each sample by the transformation of the 1KGP input datasets. The output metadata include information collected from all four of the 1KGP metadata files considered. Some categories are not reported in the source metadata files; they are identified by the label manually_curated__... and were added by the developed pipeline to store technical details (e.g., download date, md5 hash of the source file, file size) and information derived from knowledge of the source, such as the species, the processing pipeline used in the source, and the health status. For every information category, the table reports a possible value. The third column (cardinality > 1) tells whether the same key can appear multiple times in the output GDM metadata file. This is used to represent multi-valued metadata categories; for example, in a GDM metadata file, the key manually_curated__chromosome appears once for every chromosome mutated by the variants of the sample.
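
The multi-valued keys described above can be illustrated with a minimal Python sketch. It assumes that a GDM metadata file is plain text with one tab-separated key/value pair per line, so a key with cardinality > 1 simply occurs on several lines. The function names, the key manually_curated__species, and all sample values are hypothetical and chosen only for illustration; only the key manually_curated__chromosome is taken from the description.

```python
from collections import defaultdict


def parse_gdm_metadata(text):
    """Group tab-separated key/value lines into a key -> list-of-values mapping.

    Assumption: one "key<TAB>value" pair per line; repeated keys are allowed.
    """
    metadata = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition("\t")
        metadata[key].append(value)
    return dict(metadata)


def multi_valued_keys(metadata):
    """Return the sorted keys that occur more than once (cardinality > 1)."""
    return sorted(key for key, values in metadata.items() if len(values) > 1)


# Hypothetical file content: the species key and all values are illustrative.
sample = (
    "manually_curated__species\tHomo sapiens\n"
    "manually_curated__chromosome\tchr1\n"
    "manually_curated__chromosome\tchr2\n"
)

parsed = parse_gdm_metadata(sample)
print(multi_valued_keys(parsed))
```

Storing every key as a list keeps single-valued and multi-valued categories uniform, so the cardinality check reduces to a length test.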

Funding

H2020 European Research Council

    BMC Bioinformatics
