Software Tool Article

bit: a multipurpose collection of bioinformatics tools

[version 1; peer review: 2 not approved]
PUBLISHED 31 Jan 2022

This article is included in the Bioinformatics gateway.

This article is included in the Python collection.

Abstract

bit is a collection of small scripts and programs that facilitate many common tasks in bioinformatics. It operates in a Unix-like command-line environment and is composed of Bash and Python code. bit is openly available on GitHub, archived with Zenodo, and is conda installable. The package is useful for users who want to do things such as manipulate fasta files, calculate GC content, quickly summarize nucleotide assemblies, download assemblies from NCBI based only on their accessions, pull amino-acid sequences from GenBank files, calculate Shannon uncertainty for columns in multiple sequence alignments, and more. The source code is hosted on GitHub: github.com/AstrobioMike/bit

Keywords

bioinformatics, toolkit, command-line

Introduction

There are of course several great and widely used packages of bioinformatics helper programs already available, including seqtk,1 fastX-toolkit,2 and bbtools,3 all of which I use regularly and which have helped me accomplish my goals. But tasks always crop up for which no helper program or script has yet been written. bit is a collection of small scripts and programs that were not written for any single piece of research work. Rather, it is a collection that has been built (and is still being built) over several years. Any time I need to write something to perform a task that is more than a one-off, something I end up using frequently, I consider adding it to the bit package. Some programs are light wrappers that extend and/or simplify the utility of existing software (like taxonkit4 and goatools5); many are written in Python and leverage the Biopython6 module (e.g. programs to summarize assemblies, calculate GC content, calculate Shannon uncertainty per column in multiple sequence alignments, and pull amino-acid sequences from GenBank files); and many are Bash scripts that do things like download any assembly in different file formats from NCBI7 just by providing a list of wanted accessions. It is a rather random collection, but it is of convenience to many users.
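To make two of the calculations mentioned above concrete, here is a minimal sketch of GC content and per-column Shannon uncertainty. It is not bit's own code: the function names are hypothetical, it uses only the Python standard library rather than Biopython, and it assumes that gap characters are ignored when computing a column's uncertainty (bit may treat gaps differently).

```python
import math
from collections import Counter


def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def column_uncertainty(column, gap_chars="-."):
    """Shannon uncertainty (in bits) of one alignment column.

    Assumption for this sketch: gap characters are skipped, and a column
    made up only of gaps gets an uncertainty of 0.0.
    """
    residues = [c.upper() for c in column if c not in gap_chars]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1 / p), where p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_uncertainty(aligned_seqs):
    """Return the Shannon uncertainty of every column in an alignment."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*rows) turns the list of rows into a sequence of columns
    return [column_uncertainty(col) for col in zip(*aligned_seqs)]
```

A fully conserved column scores 0 bits, and the score rises as the residues in a column become more evenly mixed, which is why the measure is useful for spotting variable positions in an alignment.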

Methods

Implementation

The package is written in Bash and Python (3+), and is built to run in a Unix-like environment.

Operation

bit is packaged with conda,8 which serves as its primary means of installation. The conda installation handles all dependencies, which include: Python v3+; Biopython6 v1.79+; pybedtools9 v0.8.2+; GNU parallel10 v20211022+; pandas11 v1.3.4+; Entrez Direct12 v16.2+; taxonkit4 v0.9.0+; and goatools5 v0.8.12.

Use cases

All commands are prefixed with ‘bit-’, so the full list can be seen by typing that prefix and pressing Tab twice. Each command prints a help menu when run with no arguments or with ‘-h’. Figure 1 shows an example using the program for downloading genome assemblies from NCBI by providing accessions.


Figure 1. Example accessing the help menu and using the program for downloading genome assemblies from NCBI.
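The shared ‘bit-’ prefix also makes the commands easy to list from a script. The sketch below does roughly what double-Tab completion does: it scans the directories on PATH for executables whose names start with a given prefix. The function name find_commands is hypothetical and not part of bit, and the sketch deliberately assumes nothing about which bit- commands exist.

```python
import os


def find_commands(prefix, path=None):
    """Return sorted names of executables on PATH that start with prefix."""
    if path is None:
        path = os.environ.get("PATH", "")
    found = set()
    for directory in path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        try:
            names = os.listdir(directory)
        except OSError:
            # Skip directories that cannot be read
            continue
        for name in names:
            full = os.path.join(directory, name)
            if (
                name.startswith(prefix)
                and os.path.isfile(full)
                and os.access(full, os.X_OK)
            ):
                found.add(name)
    return sorted(found)
```

In an environment where bit is installed, calling find_commands("bit-") should return the same names that tab completion shows.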

Software availability

Source code available from: https://www.github.com/AstrobioMike/bit.

Archived analysis code as at time of publication: https://doi.org/10.5281/zenodo.3383647.

License: GNU GPL v3.0.

how to cite this article
Lee M. bit: a multipurpose collection of bioinformatics tools [version 1; peer review: 2 not approved] F1000Research 2022, 11:122 (https://doi.org/10.12688/f1000research.79530.1)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to reviewer statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Reviewer Report 23 Mar 2022
Georges Hattab, Department of Mathematics and Computer Science, University of Marburg, Marburg, Hesse, 35032, Germany 
Not Approved
The author has put some effort to compile a set of useful scripts. By looking at the number of references alone, it is clear the author lacks knowledge in the field of bioinformatics as many works are left out or ...
HOW TO CITE THIS REPORT
Hattab G. Reviewer Report For: bit: a multipurpose collection of bioinformatics tools [version 1; peer review: 2 not approved]. F1000Research 2022, 11:122 (https://doi.org/10.5256/f1000research.83522.r128277)
Reviewer Report 22 Feb 2022
Kai Zhang, Ludwig Institute for Cancer Research, La Jolla, CA, USA 
Not Approved
The author reports a collection of scripts for various bioinformatics tasks. However, the manuscript is too short and lacks important details for me to assess the significance and novelty of this work. The rationale for this study is unclear. The ...
HOW TO CITE THIS REPORT
Zhang K. Reviewer Report For: bit: a multipurpose collection of bioinformatics tools [version 1; peer review: 2 not approved]. F1000Research 2022, 11:122 (https://doi.org/10.5256/f1000research.83522.r122738)