Draft genome and sequence variant data of the oomycete Pythium insidiosum strain Pi45 from the phylogenetically-distinct Clade-III

Pythium insidiosum is a unique oomycete microorganism, capable of infecting humans and animals. The organism can be phylogenetically categorized into three distinct clades: Clade-I (strains from the Americas), Clade-II (strains from Asia and Australia), and Clade-III (strains from Thailand and the United States). Draft genomes of two P. insidiosum strains, the Clade-I strain CDC-B5653 and the Clade-II strain Pi-S, are available in the public domain. A genome of P. insidiosum from the distinct Clade-III, which is distantly related to the other two clades, has been lacking. Here, we report the draft genome sequence of P. insidiosum strain Pi45 (also known as MCC13; isolated from a Thai patient with pythiosis; accession numbers BCFM01000001-BCFM01017277) as a representative strain of the phylogenetically distinct Clade-III. We also report a genome-scale data set of sequence variants (i.e., SNPs and INDELs) found in P. insidiosum (accessible online at the Mendeley database: http://dx.doi.org/10.17632/r75799jy6c.1).


Value of the data
The first draft genome sequence of a P. insidiosum strain from Clade-III, a phylogenetically distinct clade defined by rDNA-based analysis, is now available.
Draft genome data of the P. insidiosum strain Pi45 will be valuable for comparative genomic studies of Pythium species and related oomycetes.
Sequence variant data (i.e., SNPs and INDELs) can be applied to identification of the organism, genetic polymorphism analyses, genotype-phenotype association studies, and epidemiological exploration.

Data
Pythium insidiosum is a member of the oomycetes, a unique group of fungus-like microorganisms belonging to the Kingdom Stramenopiles [1]. P. insidiosum is distinguished from other oomycetes by its capacity to infect humans and animals [1][2][3]. The infectious condition caused by this organism, called 'pythiosis', usually leads to life-long disability or death in affected individuals [2][3][4][5]. A genome sequence is a powerful resource for exploring an organism of interest at the molecular level. Draft genomes of two P. insidiosum strains, CDC-B5653 [6] and Pi-S [7], are available in the public domain. P. insidiosum can be divided into three distinct clades: Clade-I (strains from the Americas); Clade-II (strains from Asia and Australia); and Clade-III (strains from Thailand and the United States) (Table 1; Fig. 1). The strain CDC-B5653 (labeled as Pi10) is placed in Clade-I, whereas the strain Pi-S (labeled as Pi35) is placed in Clade-II (Fig. 1). A genome of P. insidiosum from the distinct Clade-III, which is distantly related to the other two clades, has been lacking. Here, we report genome data of P. insidiosum strain Pi45, isolated from a Thai patient and categorized as Clade-III (Fig. 1). We also report a genome-scale data set of sequence variants (i.e., SNPs and INDELs) found in P. insidiosum.

Identification of sequence variants
A total of 7,843,910 adaptor-removed, quality-validated reads (23.3% of all reads) derived from P. insidiosum strain Pi45 could be aligned to the reference genome of P. insidiosum strain Pi-S [7] using the Burrows-Wheeler Alignment tool [20]. From these alignments, a total of 865,332 variants (i.e., SNPs and INDELs) were identified by FreeBayes [21].
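For readers who wish to tally the variant types in a call set, the following is a minimal sketch, assuming the variants are stored in a standard tab-delimited VCF file (the format FreeBayes writes). The function names and the example records are hypothetical and are not taken from the deposited data. The length-based classification rule is a simplification: same-length multi-base substitutions are counted as "OTHER", and complex alleles of unequal length are counted as INDELs.

```python
from collections import Counter


def classify_allele(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair by allele length (a simplifying assumption)."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "INDEL"
    return "OTHER"  # e.g., multi-nucleotide substitutions of equal length


def count_variants(vcf_lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines (those starting with '#').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]  # VCF columns 4 (REF) and 5 (ALT)
        # A record may list several comma-separated alternate alleles.
        for alt in alts.split(","):
            counts[classify_allele(ref, alt)] += 1
    return counts


# Hypothetical records for illustration only (not real Pi45 variants).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig1\t101\t.\tA\tG\t50\t.\t.",
    "contig1\t250\t.\tAT\tA\t42\t.\t.",
    "contig2\t17\t.\tC\tCGG,T\t60\t.\t.",
    "contig2\t90\t.\tAC\tGT\t33\t.\t.",
]
counts = count_variants(example)
print(dict(counts))
```

To apply the same tally to a file on disk, an open file handle can be passed in place of the example list, for instance `count_variants(open("variants.vcf"))`, where the file name is a placeholder.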

Data accessibility
The draft genome sequence of P. insidiosum strain Pi45 (also known as MCC13) has been deposited in the DNA Data Bank of Japan (DDBJ) under the accession numbers BCFM01000001-BCFM01017277. The sequence variant data (i.e., SNPs and INDELs) of P. insidiosum strain Pi45 are accessible online at the Mendeley database (http://dx.doi.org/10.17632/r75799jy6c.1).
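The accession range above can be expanded into individual record identifiers, for example to build a batch download list. The sketch below assumes the range is contiguous and that each accession consists of a letter prefix followed by a fixed-width number. The helper name `expand_accession_range` is hypothetical.

```python
import re


def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand 'PREFIXnnnn'-'PREFIXmmmm' into every accession in between."""
    pattern = r"([A-Z]+)(\d+)"
    m_first = re.fullmatch(pattern, first)
    m_last = re.fullmatch(pattern, last)
    if not m_first or not m_last:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same letter prefix")
    if len(m_first.group(2)) != len(m_last.group(2)):
        raise ValueError("accessions must have the same number of digits")

    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep the zero padding intact
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if start > end:
        raise ValueError("first accession must not come after the last")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


accessions = expand_accession_range("BCFM01000001", "BCFM01017277")
print(len(accessions), accessions[0], accessions[-1])
```

Under the contiguity assumption, the range corresponds to 17,277 sequence records.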