Hybracter v0.7.0 Benchmarking Output
Description
This dataset contains:
- The subsampled FASTQ files used to benchmark Hybracter (https://github.com/gbouras13/hybracter).
- Benchmarking output for Hybracter v0.7.0 vs Unicycler v0.5.0 vs Dragonflye v1.1.2 on these files.
The full benchmarking code and explanation are available at https://github.com/gbouras13/hybracter_benchmarking.
The `hybracter_benchmarking_fastqs.tar.gz` tarball contains subsampled FASTQs (gzipped) of the first 20 samples used to benchmark `hybracter`. These are the JKD6159, Lerminiaux, Chitale and super-accuracy model basecalled simplex ATCC FASTQs.
The `PRJNA1087001_ATCC_SUP_Duplex_FAST_Simplex_fastqs.tar.gz` tarball contains subsampled FASTQs (gzipped) of the 10 samples added in v2 of the preprint to benchmark `hybracter`. These are the fast model basecalled simplex ATCC FASTQs and the super-accuracy model basecalled duplex ATCC FASTQs.
The other 4 tarballs (`hybracter_benchmarking_results_v0.7.0.tar.gz`, `hybracter_benchmarking_results_fast.tar.gz`, `hybracter_benchmarking_results_duplex.tar.gz` and `hybracter_depth_Lerminiaux_isolateB_benchmarking_results.tar.gz`) contain the benchmarking outputs for, respectively, the first 20 samples, the 5 fast model basecalled simplex ATCC samples, the 5 super-accuracy model basecalled duplex ATCC samples, and the depth analysis for Lerminiaux isolate B.
When untarred, each tarball contains:
- `BENCHMARKS` - contains the time and other benchmarking measurements for each run (sample x tool)
- `DNADIFF` - contains raw chromosome Dnadiff results for each run (sample x tool)
- `DNADIFF_PARSED_OUTPUT` - contains parsed chromosome Dnadiff results for each sample
- `DNADIFF_PLASMIDS` - contains plasmid Dnadiff results for each run (sample x tool)
- `DNADIFF_PARSED_OUTPUT_PLASMID` - contains parsed plasmid Dnadiff results for each sample
- `REAL` - contains all the actual output for each assembler. The following 5 directories contain all the raw output, with subdirectories for each sample:
  - `HYBRACTER_HYBRID_OUTPUT`
  - `HYBRACTER_LONG_OUTPUT`
  - `DRAGONFLYE_HYBRID_OUTPUT`
  - `DRAGONFLYE_LONG_OUTPUT`
  - `UNICYCLER_OUTPUT`
- Additionally, `hybracter_benchmarking_results_v0.7.0.tar.gz` contains `HYBRACTER_HYBRID_OUTPUT_REAL_BULK`, which holds the output for the 12 Lerminiaux et al. isolates assembled using `hybracter hybrid` with the modified config file `bulk_assemble_lerminiaux_config.yaml`.
- It also contains a number of other subdirectories (`_SUMMARIES`, `_PLASMIDS`, `_CHROMOSOMES`) with parsed summary outputs and the extracted plasmid and chromosome assemblies for Unicycler and Dragonflye, which made the assessment much easier and allowed it to be automated.
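Once a results tarball has been extracted, a quick layout check can confirm that the directories listed above are present. The following is a minimal Python sketch, not part of the benchmarking code. The function name `missing_dirs` is hypothetical, and it assumes the listed directories sit directly under the folder passed as `root`, which the description does not state explicitly. Only the first five directory names are checked by default; the list can be extended as needed.

```python
from pathlib import Path

# Directory names copied from the list above. Their location directly under
# the extraction folder is an assumption, not something the record confirms.
EXPECTED_DIRS = [
    "BENCHMARKS",
    "DNADIFF",
    "DNADIFF_PARSED_OUTPUT",
    "DNADIFF_PLASMIDS",
    "DNADIFF_PARSED_OUTPUT_PLASMID",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directory names that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty list returned by `missing_dirs("path/to/extracted/folder")` would indicate that every expected directory was found at that level.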
To untar a tarball, e.g.:
`tar -xzf hybracter_benchmarking_results_v0.7.0.tar.gz`
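Where a scripted workflow is preferred over the `tar` command, the same extraction can be sketched with Python's standard `tarfile` module. The helper name `extract_tarball` is hypothetical. It extracts a gzipped tarball into a destination folder and reports the top-level entries, which is a convenient way to see how the archive is laid out before working with it.

```python
import tarfile
from pathlib import Path


def extract_tarball(tarball, dest="."):
    """Extract a .tar.gz archive into dest and return its top-level entry names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:gz") as tar:
        members = tar.getmembers()
        tar.extractall(dest)
    top_level = set()
    for member in members:
        parts = Path(member.name).parts
        if parts:  # skip a bare "." entry, which has no path components
            top_level.add(parts[0])
    return sorted(top_level)
```

For example, `extract_tarball("hybracter_benchmarking_results_v0.7.0.tar.gz", "results")` would unpack the archive into `results/` and return the names found at its top level. As with any archive, it is sensible to extract only tarballs obtained from a trusted source.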
Files
(35.2 GB)
MD5 checksum | Size |
---|---|
md5:5b4105760fd2d12c499ed8c794af35f4 | 13.7 GB |
md5:0c25c64dd756198b9078ea30c85d0344 | 1.4 GB |
md5:6e44c2c3269783429588249f9406460d | 1.5 GB |
md5:1cf95c03b2ca2f6c2e9621fb813e0394 | 8.3 GB |
md5:4ff5b952b95708c4fb856486b16e1916 | 7.3 GB |
md5:8d96b424c071e371515d70ed8ccaa2e4 | 3.1 GB |
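Because the tarballs are large, it can be worth checking a download against the MD5 values in the file listing before extracting it. Below is a minimal Python sketch using the standard `hashlib` module. The function names are hypothetical, and since the listing here does not show which checksum belongs to which tarball, the expected value has to be supplied by the user for the file in question.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as in the listing, e.g. 'md5:5b41...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low even for the multi-gigabyte tarballs. A call such as `matches_listed_md5("hybracter_benchmarking_fastqs.tar.gz", "md5:...")`, with the checksum shown for that file on the record page, returns `True` when the download is intact.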
Additional details
Dates
- Created: 2023-11-20