Complete Chloroplast Genome Sequence of an Engelmann Spruce (Picea engelmannii, Genotype Se404-851) from Western Canada

Engelmann spruce (Picea engelmannii) is a conifer found primarily on the west coast of North America. Here, we present the complete chloroplast genome sequence of Picea engelmannii genotype Se404-851. This chloroplast sequence will benefit future conifer genomic research and contribute resources to further species conservation efforts.

We sequenced, assembled, and annotated the complete chloroplast genome of Engelmann spruce (Picea engelmannii, genotype Se404-851). Engelmann spruce dominates much of the large spruce forests of interior British Columbia, where it has been reported to hybridize with Picea glauca and Picea sitchensis (1), and its range extends southward to New Mexico. The tree has three different genomes: a nuclear genome, a mitochondrial genome, and a plastid (i.e., chloroplast) genome. In general, chloroplast genomes are derived from the ancestral genomes of the microbial endosymbiont from which these organelles originated (2).
A tissue sample was collected from a 13-year-old Engelmann spruce grown at the Kalamalka Forestry Centre in British Columbia (50°14′38.4″N, 119°16′40.8″W; elevation, 450 m) and planted from a seed from Don Fernando Mountain, New Mexico (36°17′60″N, 105°24′0″W; elevation, 2,987 m). Genomic DNA was extracted from 60 g of tissue by Bio S&T using an organelle exclusion method, yielding 300 μg of high-quality purified nuclear DNA, as previously described (3). The sample was sequenced at Canada's Michael Smith Genome Sciences Centre.
To sequence the sample, a 900-bp whole-genome library was constructed following a previously described protocol (4, 5) with minor modifications. Briefly, 5 μg of genomic DNA was sheared by sonication (Covaris LE220) using a duty factor of 5 and peak incident power of 450 for 70 seconds. The sonicated DNA products were fractionated in a 6% PAGE gel to recover fragments greater than 700 bp for library preparation. These PCR-free libraries were sequenced with paired-end 150-base reads on an Illumina HiSeq X platform using V4 chemistry according to the manufacturer's recommendations. Four libraries were generated with this protocol, and approximately 200 million reads were sequenced from each.
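As a rough illustration of the sequencing yield implied by these figures, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, not a reported result: it assumes that "approximately 200 million reads" counts individual 150-base reads rather than read pairs, and the constant and function names are illustrative only.

```python
# Back-of-the-envelope yield estimate from the figures quoted in the text.
# Assumption: "200 million reads" counts individual reads, not read pairs.
LIBRARIES = 4
READS_PER_LIBRARY = 200_000_000  # approximate, per the text
READ_LENGTH = 150                # bases per read


def total_bases(libraries: int, reads_per_library: int, read_length: int) -> int:
    """Return the total number of sequenced bases across all libraries."""
    return libraries * reads_per_library * read_length


total = total_bases(LIBRARIES, READS_PER_LIBRARY, READ_LENGTH)
print(f"{total / 1e9:.0f} Gbp")  # prints "120 Gbp"
```

If the counts instead refer to read pairs, the estimate doubles.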
Using BLAST v2.7.1 (9), we aligned our assembly to the reference chloroplast sequence (PG29), modifying start and stop positions for consistency with previously published conifer chloroplast genomes. To ensure that there were no missing sequences at the ends of our assembly, we introduced a gap at the end, circularized the sequence, and ran Sealer v2.1.1 (10), closing the "end" gap and removing overlapping sequences as previously described (11). Finally, the resulting assembly was polished using Pilon v1.22 (12) using the 3-million subset of read pairs aligned with the Burrows-Wheeler Aligner (BWA) v0.1.7 (13).

[Figure caption, partially recovered: "... (14) and plotted using OrganellarGenomeDRAW v1.2 (15). The inner gray circle illustrates the GC content of the genome."]
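The idea of adjusting the start position of a circular genome for consistency with a reference can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy sequence, and the start offset are hypothetical, and in practice the offset would be derived from the alignment to the reference. A simple GC-content helper is included because the figure reports GC content.

```python
def rotate_circular(seq: str, new_start: int) -> str:
    """Rotate a circular sequence so that index new_start becomes position 0."""
    if not seq:
        return seq
    new_start %= len(seq)  # wrap offsets beyond the sequence length
    return seq[new_start:] + seq[:new_start]


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


# Toy example; the sequence and offset are made up for illustration.
toy = "GGCCATATAT"
rotated = rotate_circular(toy, 4)
print(rotated)              # prints "ATATATGGCC"
print(gc_content(rotated))  # prints 0.4
```

Because rotation only changes where the linear representation of the circle begins, composition statistics such as GC content are unaffected by the choice of start position.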
The introduction of this new chloroplast genome will benefit conifer genomic research and inform future evolutionary studies.
Data availability. The complete chloroplast genome sequence of Picea engelmannii genotype Se404-851 is available under GenBank accession number MK241981, and the raw reads are available under SRA numbers SRX5070635 and SRR8252852. The annotations used as references were from Picea abies (GenBank accession number NC_021456), Picea asperata (GenBank accession number NC_032367), Picea glauca genotype PG29 (GenBank accession number NC_028594), Picea morrisonicola (GenBank accession number NC_016069), and Picea sitchensis (GenBank accession numbers NC_011152 and KU215903).
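For readers who want to retrieve the record programmatically, one option is the NCBI E-utilities efetch endpoint. The sketch below only builds the request URL for the accession given above and does not perform any network access; the helper name efetch_url is illustrative, and the choice of FASTA as the return type is an assumption about what the reader wants.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession: str, rettype: str = "fasta") -> str:
    """Build an efetch URL for a nucleotide record (no request is sent)."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,
        "rettype": rettype,   # e.g. "fasta" or "gb"
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


url = efetch_url("MK241981")
print(url)
```

Passing rettype="gb" requests the annotated GenBank flat file instead of the bare sequence.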

ACKNOWLEDGMENTS
This work was supported by funds from Genome Canada, Genome BC, and Genome Quebec as part of the Spruce-Up (www.spruce-up.ca) (243FOR) and AnnoVis (281ANV) projects.