Draft genome sequence of Phomopsis longicolla isolate MSPL 10-6

Phomopsis longicolla is the primary cause of Phomopsis seed decay in soybean. This disease severely affects soybean seed quality by reducing seed viability and oil content, altering seed composition, and increasing frequencies of moldy and/or split beans. It is one of the most economically important soybean diseases. Here, we report the de novo assembled draft genome sequence of the P. longicolla isolate MSPL10-6, which was isolated from field-grown soybean seed in Mississippi, USA. This study represents the first reported genome sequence of a seedborne fungal pathogen in the Diaporthe–Phomopsis complex. The P. longicolla genome sequence will enable research into the genetic basis of fungal infection of soybean seed and provide information for the study of soybean–fungal interactions. The genome sequence will also be valuable for molecular genetic marker development, manipulation of pathogenicity-related genes and development of new control strategies for this pathogen.

Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).

Pathogen isolation, identification, and pathogenicity test
Phomopsis longicolla is the primary cause of Phomopsis seed decay in soybean [1,2]. An isolate of P. longicolla, MSPL10-6, was obtained from field-grown soybean seed in Stoneville, Mississippi, USA in 2010 using the standard seed plating procedure [3]. Briefly, soybean seeds were surface-disinfected in 0.5% sodium hypochlorite for 3 min, rinsed in sterile distilled water, and then placed on acidified potato dextrose agar (APDA) medium (Difco Laboratories, Detroit, MI) that was adjusted to pH 4.8 with 25% lactic acid after autoclaving. Seeds were placed on each Petri dish and incubated at 24°C for 4-7 days [4]. Putative fungal colonies were selected and examined under a microscope. P. longicolla was identified using morphological characteristics according to Hobbs et al. [1]. The identification was confirmed by analysis of the ITS region of rDNA amplified by PCR with primers ITS1, 5′-TCCGTAGGTGAACCTGCGG-3′ and ITS4, 5′-TCCTCCGCTTATTGATATGC-3′ [5].
Pathogenicity tests were performed using a cut-seedling inoculation assay [6]. Seed of a susceptible soybean cultivar, Williams 82, was used in the tests. Mycelial plugs (4 mm in diameter) from the margin of a 10-day-old culture on APDA were punched out with the large ends of disposable micropipette tips (200 μl). The micropipette tip containing the fungal mycelium was subsequently placed over the stem of a 2-week-old soybean seedling that had been cut just below the first trifoliolate node. Micropipette tips containing plugs of non-infested APDA served as the negative control. Two days after inoculation, the micropipette tips were removed.

Data analysis and results
The mate-pair library produced 72,216,734 reads (mean length = 113 bp, insert size = 3,900 bp) with a total of 8.2 billion bp representing 128-fold coverage. The paired-end library produced 63,763,666 reads (mean length = 97 bp, insert size = 524 bp) with a total of 6.2 billion bp representing 97-fold coverage. A draft of the P. longicolla genome was assembled from both libraries with SOAPdenovo assembler version 2.04 [7] into 108 scaffolds of 500 bases or larger; among these, the N50 length was 1,039,102 bp, and the largest scaffold contained 6,247,470 bp. The resulting draft genome sequence was estimated to be approximately 62 Mb in size with an overall G + C content of 48.6%. Gene prediction analysis using Augustus version 2.5.5 [8] yielded a total of 15,738 predicted protein-coding regions. Predicted gene models were annotated using blastp against the UniRef100 database; 8,868 (56%) of the gene models had significant matches (E value ≤ 1e−5) to genes in the database. The P. longicolla sequencing and assembly statistics are summarized in Table 1.
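
For readers unfamiliar with the summary statistics above, the following is a minimal Python sketch of how an N50 length and a fold-coverage figure are conventionally computed. It is not the pipeline used in this study. The function names `n50` and `fold_coverage` and the toy scaffold lengths are hypothetical, and the coverage calculation assumes the rounded 62 Mb genome size, so it only approximates the reported 128-fold and 97-fold values.

```python
def n50(lengths):
    """Return the N50: the length of the scaffold at which the running
    total, taken from longest to shortest, first reaches half of the
    total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


def fold_coverage(total_bases, genome_size):
    """Return sequencing depth as total sequenced bases per genome base."""
    return total_bases / genome_size


# Toy scaffold lengths (hypothetical, not taken from the assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print("toy N50:", n50(toy_scaffolds))

# Read counts and mean read lengths as reported in the text;
# the genome size is the rounded estimate of 62 Mb (an assumption here).
GENOME_SIZE = 62_000_000
mate_pair_bases = 72_216_734 * 113
paired_end_bases = 63_763_666 * 97
print("mate-pair coverage: %.0f-fold" % fold_coverage(mate_pair_bases, GENOME_SIZE))
print("paired-end coverage: %.0f-fold" % fold_coverage(paired_end_bases, GENOME_SIZE))
```

Because the inputs are rounded, the coverage values printed by this sketch differ slightly from those reported; the exact figures depend on the genome size estimate used in the original analysis.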