Plastomes of the green algae Hydrodictyon reticulatum and Pediastrum duplex (Sphaeropleales, Chlorophyceae)

Background. Comparative studies of chloroplast genomes (plastomes) across the Chlorophyceae are revealing dynamic patterns of size variation, gene content, and genome rearrangements. Phylogenomic analyses are improving resolution of relationships and uncovering novel lineages as new plastomes continue to be characterized. To gain further insight into the evolution of the chlorophyte plastome and increase the number of representative plastomes for the Sphaeropleales, this study presents two fully sequenced plastomes from the green algal family Hydrodictyaceae (Sphaeropleales, Chlorophyceae), one from Hydrodictyon reticulatum and the other from Pediastrum duplex.
Methods. Genomic DNA from Hydrodictyon reticulatum and Pediastrum duplex was subjected to Illumina paired-end sequencing and the complete plastomes were assembled for each. Plastome size and gene content were characterized and compared with other plastomes from the Sphaeropleales. Homology searches using BLASTX were used to characterize introns and open reading frames (orfs) ≥300 bp. A phylogenetic analysis of gene order across the Sphaeropleales was performed.
Results. The plastome of Hydrodictyon reticulatum is 225,641 bp and that of Pediastrum duplex is 232,554 bp. The plastome structure and gene order of H. reticulatum and P. duplex are more similar to each other than to other members of the Sphaeropleales. Numerous unique open reading frames are found in both plastomes, and the plastome of P. duplex contains putative viral protein genes not found in other Sphaeropleales plastomes. Gene order analyses support the monophyly of the Hydrodictyaceae and their sister relationship to the Neochloridaceae.
Discussion. The complete plastomes of Hydrodictyon reticulatum and Pediastrum duplex, representing the largest of the Sphaeropleales sequenced thus far, once again highlight the variability in size, architecture, gene order and content across the Chlorophyceae. Novel intron insertion sites and unique orfs indicate recent, independent invasions into each plastome, a hypothesis testable with an expanded plastome investigation within the Hydrodictyaceae.

The freshwater green algal family Hydrodictyaceae, a member of the Sphaeropleales and sister to the Neochloridaceae (Fučíková et al., 2014), includes the well-known genera Hydrodictyon Roth 1797 and Pediastrum Meyen 1829. The Hydrodictyaceae has undergone taxonomic revisions based on molecular phylogenetic studies of individual nuclear and chloroplast genes (Buchheim et al., 2005; McManus & Lewis, 2011); however, several relationships remain unresolved, particularly the paraphyly of Pediastrum duplex Meyen 1829 and its relationship to Hydrodictyon (McManus & Lewis, 2011). Farwagi, Fučíková & McManus (2015) presented the first complete mitochondrial genomes of four representatives from the Hydrodictyaceae. The results revealed size differences and gene rearrangements that carry phylogenetic signal, indicating that whole genome-level studies of the Hydrodictyaceae may be useful in resolving ongoing systematic questions.
To gain further insight into the evolution of the chlorophyte plastome and increase the number of representative plastomes for the Sphaeropleales, we fully sequenced the plastomes of a strain of Hydrodictyon reticulatum (L.) Bory 1824 and Pediastrum duplex. The complete plastomes of these Hydrodictyaceae strains, representing the largest of the Sphaeropleales sequenced thus far, once again highlight the variability in size, architecture, gene order and content across this order.

MATERIALS AND METHODS
Hydrodictyon reticulatum was collected from the freshwater Geyser Brook, Saratoga Co., NY, USA (43.058117, −73.807914) on 23 July 2014 and DNA was extracted directly from the field collection. A strain of Pediastrum duplex (EL0201CT/HAM0001) was isolated from the freshwater Eagleville Pond, Tolland Co., CT, USA (41.7848239, −72.2805262) in June 2002 and maintained in culture at 20 °C under a 16:8 h light:dark (L:D) cycle on agar slants. The agar slants consisted of a 50:50 mixture of Bold's basal medium (BBM) (Bold, 1949; Bischoff & Bold, 1963) and soil water prepared following McManus & Lewis (2011) in 3% agar. Voucher material for each strain is deposited in The New York Botanical Garden William and Lynda Steere Herbarium (NY) under barcodes 02334980 and 02334981, respectively. Duplicate specimens of each are deposited in the George Safford Torrey Herbarium at the University of Connecticut (CONN) and in the personal collection of HAM.

RESULTS
DNA sequence data collection resulted in 12.5 million paired-end reads for Hydrodictyon reticulatum and 9.9 million paired-end reads for Pediastrum duplex. The plastome for each strain was assembled with no gaps, and the average coverage was 195X (225,641 bp) for H. reticulatum (Fig. 1; GenBank accession KY114065) and 134X (232,554 bp) for P. duplex (Fig. 2; GenBank accession KY114064). Each plastome comprised two copies of an inverted repeat (IR) separated by two single-copy (SC) regions. Hydrodictyon reticulatum contained 102,823 bp and 86,226 bp SC regions and P. duplex 98,587 bp and 94,307 bp SC regions. Inferred protein translations indicated the universal genetic code was used in both plastomes, and RNA editing did not appear to be necessary. All protein-coding regions used the AUG start codon, with the exception of psbC, which used GUG. The coding regions for each plastome included genes for 3 rRNAs, 25 unique tRNAs and 68 functionally identifiable protein genes, including ycf1, ycf3, ycf4 and ycf12 (Table 1). Fifty-nine putative open reading frames (orfs) ≥300 bp of unknown function were identified in the plastome of H. reticulatum and 32 were identified in the plastome of P. duplex (Table 1).
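As an illustration of how orfs of at least 300 bp can be located in an assembled sequence, the following minimal Python sketch scans all six reading frames. It is not the annotation pipeline used in this study. The function names, the decision to measure length from the start codon through the stop codon, the ATG-only default start, and the treatment of the sequence as linear (so an orf spanning the origin of a circular plastome would be missed) are all assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stops of the universal genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=300, starts=("ATG",)):
    """Yield (strand, start, end, length) for orfs of at least min_len nucleotides.

    Coordinates are 0-based and end-exclusive, relative to the scanned strand.
    Length runs from the first start codon of an open frame through its stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon in starts:
                    orf_start = i  # open a candidate orf at the first start codon
                elif orf_start is not None and codon in STOP_CODONS:
                    length = i + 3 - orf_start
                    if length >= min_len:
                        yield strand, orf_start, i + 3, length
                    orf_start = None  # close the frame and look for the next start


# Toy sequence (hypothetical): a start codon, 100 alanine codons, and a stop codon.
toy = "ATG" + "GCA" * 100 + "TAA"
for orf in find_orfs(toy):
    print(orf)
```

Calling `find_orfs(seq, starts=("ATG", "GTG"))` would additionally admit GTG starts, such as the GUG start reported here for psbC, at the cost of more candidate orfs.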
The coding region made up 59.6% of the Hydrodictyon reticulatum plastome and 53.3% of the Pediastrum duplex plastome (Table 2). Gene content of known genes was similar to that of other sphaeroplealean plastomes, but the trnG (gcc) gene was not detected in either plastome, as in Neochloris aquatica (Fučíková, Lewis & Lewis, 2016a). The IR in H. reticulatum was 18,296 bp and contained atpH, rrf, rrl, rrs, trnA (ugc), trnI (gau), and trnS (gcu). The IR in P. duplex was 19,830 bp and included the same genes as H. reticulatum, plus four introns in rrl not found in H. reticulatum. Like other members of the Sphaeropleales, psaA was trans-spliced in both plastomes with exon 1 in the smaller SC and exons 2 and 3 in the larger SC.
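The reported region lengths can be checked against the reported plastome sizes, since a quadripartite plastome is the sum of its two single-copy regions and two IR copies. The short Python sketch below does that arithmetic with the values given above; the dictionary layout and the function name are illustrative choices, not part of the study.

```python
# Region lengths (bp) as reported in the Results.
PLASTOMES = {
    "Hydrodictyon reticulatum": {"sc": (102_823, 86_226), "ir": 18_296, "total": 225_641},
    "Pediastrum duplex": {"sc": (98_587, 94_307), "ir": 19_830, "total": 232_554},
}


def quadripartite_length(sc_regions, ir_length):
    """Sum both single-copy regions plus two copies of the inverted repeat."""
    return sum(sc_regions) + 2 * ir_length


for name, p in PLASTOMES.items():
    computed = quadripartite_length(p["sc"], p["ir"])
    print(f"{name}: computed {computed:,} bp, reported {p['total']:,} bp")

# Difference in IR length between the two plastomes.
ir_difference = PLASTOMES["Pediastrum duplex"]["ir"] - PLASTOMES["Hydrodictyon reticulatum"]["ir"]
print(f"IR length difference: {ir_difference:,} bp")
```

For both plastomes the computed total equals the reported size, and the IR lengths differ by 1,534 bp.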
Two introns were present in atpB of Pediastrum duplex. Intron 1 contained two open reading frames (orf), one with a putative reverse transcriptase, intron maturase and HNH endonuclease (orf145) and the other with a reverse transcriptase with Group II origin (orf747) (Table 3). Intron 2 contained a reverse transcriptase of Group II intron origin (orf854). No introns were found in atpB of Hydrodictyon reticulatum (Table 3). Pediastrum duplex contained one intron in psaB that contained a putative GIY-YIG homing endonuclease, and both plastomes harbored an intron that lacked an orf in trnL (uaa). The psbB gene in H. reticulatum contained an intron housing a Group II intron reverse transcriptase (orf598). Three introns were present in psbA of H. reticulatum. The first contained a putative HNH homing endonuclease (orf228). The second and third intron each harbored a reverse transcriptase with Group II intron origin (orf483 and orf602, respectively) (Table 3). Four introns were identified in rrl of P. duplex and not found in H. reticulatum. Intron 1 contained two putative site-specific DNA endonucleases (orf184, orf186), introns 2 and 4 each contained a LAGLIDADG superfamily homing endonuclease (orf171 and orf275, respectively); intron 3 did not contain a detectable orf (Table 3).
Multiple orfs greater than 300 bp were identified outside of intron regions, some of which contained HNH homing endonucleases or intron maturase proteins similar to those found in other green algae (Table 4). A reciprocal 50% protein similarity comparison of all orfs showed that none were shared between H. reticulatum and P. duplex, with one exception. The largest in both plastomes (orf1491 in H. reticulatum and orf1819 in P. duplex) shared  -Reza et al., 2014). Both plastomes shared identical gene order, while there were extensive rearrangements when compared with the closely related Acutodesmus obliquus, Chlorotetraedron incus and Neochloris aquatica (Fig. 3). The phylogenetic analysis of gene order recovered Hydrodictyon reticulatum and Pediastrum duplex as sister lineages with bootstrap support of 100, these in turn were found sister to a clade including C. incus plus N. aquatica, also with bootstrap support of 100. Acutodesmus obliquus was recovered sister to the above-mentioned taxa with bootstrap support of 57 (Fig. 4).
Figure caption (plastome map): The inverted repeats (IRA and IRB) which separate the genome into two single copy regions are indicated on the inner circle along with the nucleotide content (G/C dark grey, A/T light grey). Genes shown on the outside of the outer circle are transcribed clockwise and those on the inside counter clockwise. Gene boxes are color coded by functional group as shown in the key.

Table 1 List of plastid-encoded genes annotated for Hydrodictyon reticulatum and Pediastrum duplex.
Open reading frames (orfs) ≥300 bp are indicated separately for each plastome.

Gene class        Genes
Ribosomal RNAs    rrf x2IR, rrl x2IR (* in Pd), rrs x2IR
Transfer RNAs

DISCUSSION
The addition of the two new Hydrodictyaceae plastomes permits a more rigorous analysis of plastomes across the Sphaeropleales, and highlights the importance of increased taxon sampling to aid in understanding plastome evolutionary trends. The plastomes of Hydrodictyon reticulatum and Pediastrum duplex were considerably larger in size compared with sister sphaeroplealean lineages Neochloris aquatica, Chlorotetraedron incus and Acutodesmus obliquus, and represent the largest plastomes thus far reported from the Sphaeropleales (Table 2). The size differences can be attributed to several factors, including relatively large intergenic regions (Table 2) and the infiltration of each plastome by numerous novel orfs (Table 3).
Figure 3 Synteny map of Hydrodictyaceae with Neochloridaceae and Acutodesmus obliquus. Blocks represent regions that align to a corresponding region in another genome and colored bars within each block indicate level of sequence similarity. Lines connecting blocks indicate putative homology. Genome labels recovered from the figure: Hydrodictyon reticulatum KY114065, Chlorotetraedron incus KT199252, Acutodesmus obliquus DQ396875.
The relatively large IRs in the Hydrodictyaceae are consistent with the dynamic evolution of IRs discussed in Fučíková, Lewis & Lewis (2016b). They are larger than the ∼14 kb IRs found in most fully sequenced Sphaeropleales plastomes (with the exception of Kirchneriella aperta and Pseudomuriella schumacherensis), and similar to the IR found in Neochloris aquatica (∼18 kb). The IRs in Hydrodictyon reticulatum and Pediastrum duplex differ by 1,534 bp, and this difference is mainly due to the presence of four rrl introns in P. duplex. The presence and number of rrl introns across the Sphaeropleales do not appear to follow a clear phylogenetic pattern (see Fig. 1 of Fučíková, Lewis & Lewis, 2016a). This holds true for Hydrodictyaceae as well, but dense sampling within the family may uncover local phylogenetic patterns.
Intron number and distribution vary across the Sphaeropleales as well as within the Hydrodictyaceae. Of the five introns identified in Hydrodictyon reticulatum and eight introns in Pediastrum duplex, only the trnL (uaa) intron is shared by both. Six of the remaining 11 introns, one each in atpB and psaB, and two each in psbA and rrl, share identical insertion sites with other members of the order, suggesting possible ancestral origin of these introns. The last five introns have unique insertion sites in either H. reticulatum with two in psbA and one in psbB, or P. duplex with one in atpB and two in rrl. These introns with unique insertion sites could represent recent independent invasions into each plastome, a hypothesis testable with an expanded plastome investigation within the Hydrodictyaceae. The presence of a trnL (uaa) intron at base position 34 in both H. reticulatum and P. duplex is similar to other sphaeroplealean plastomes, with the exception of Ankyra judayi, Mychonastes homosphaera, Mychonastes jurisii, and Ourococcus multisporus (Fučíková, Lewis & Lewis, 2016a). Based on available data, the phylogenetic distribution of this intron indicates that it is of ancestral origin and independently lost at least three times across the Sphaeropleales.
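The intron totals quoted above follow directly from the per-gene counts given in the Results. The Python sketch below tallies them. The counts are transcribed from the text (Table 3 itself is not reproduced here), and the variable names are illustrative.

```python
# Intron counts per gene, transcribed from the Results text.
INTRONS = {
    "Hydrodictyon reticulatum": {"trnL(uaa)": 1, "psbB": 1, "psbA": 3},
    "Pediastrum duplex": {"trnL(uaa)": 1, "atpB": 2, "psaB": 1, "rrl": 4},
}
SHARED = "trnL(uaa)"  # the only intron reported in both plastomes

# Total introns per plastome.
totals = {taxon: sum(genes.values()) for taxon, genes in INTRONS.items()}

# Introns left once the shared trnL (uaa) intron is set aside in each plastome.
remaining = sum(
    count
    for genes in INTRONS.values()
    for gene, count in genes.items()
    if gene != SHARED
)

print(totals)
print("introns other than the shared trnL(uaa) intron:", remaining)
```

The tally gives five introns for H. reticulatum, eight for P. duplex, and 11 once the shared trnL (uaa) intron is excluded from both.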
Many of the orfs 300 bp in size or larger and not located within an intron were identified as putative homing endonucleases and reverse transcriptases similar to those found in Group I and Group II introns (Table 4). The presence of these freestanding intron-like domains may indicate the translocation of in situ genic introns, or the invasion of intergenic spacer regions by novel elements (Turmel, Otis & Lemieux, 2015). Only two of the orfs (orf1819 in Hydrodictyon reticulatum and orf1491 in Pediastrum duplex) were similar to each other and not found in other sphaeroplealean plastomes, suggesting a common origin in the Hydrodictyaceae. The conserved maturase domain found in both suggests a functional importance in each plastome. The remaining orfs, 58 in H. reticulatum and 31 in P. duplex, were unique to each plastome. Because of their sister relationship in our study, we would expect to find homologous orfs if they were present in the common ancestor. Given the lack of shared orfs, it seems more likely that each lineage was independently invaded and that Hydrodictyaceae may be particularly susceptible to plastid viral infiltration. There is evidence that suggests chloroplasts are common targets of viruses (Li et al., 2016) and viral proteins have been reported in green algal plastomes of the Oedogoniales (Brouard et al., 2008), Trebouxiophyceae (Turmel, Otis & Lemieux, 2015), prasinophytes (Lemieux, Otis & Turmel, 2014; Turmel et al., 2009), and Zygnematophyceae (Lemieux, Otis & Turmel, 2016). orf300 and orf432 in P. duplex are the first report of genes putatively coding viral proteins in a plastome of the Sphaeropleales. Further analyses of sphaeroplealean plastomes are necessary to determine additional occurrences, functionality and origin of these novel orfs.
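The idea of a reciprocal similarity comparison, in which an orf pair counts as shared only if the similarity threshold is met in both search directions, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function name, the input format, the orf identifiers, and every similarity score below are hypothetical.

```python
def reciprocal_pairs(sim_ab, sim_ba, threshold=50.0):
    """Return sorted (a, b) pairs whose similarity meets the threshold in both directions.

    sim_ab maps (a, b) to the percent similarity of a searched against b;
    sim_ba maps (b, a) to the percent similarity of b searched against a.
    """
    pairs = []
    for (a, b), score in sim_ab.items():
        # A pair missing from the reverse search is treated as 0% similar.
        if score >= threshold and sim_ba.get((b, a), 0.0) >= threshold:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical identifiers and scores, not values from this study.
hr_vs_pd = {("hr_orf1", "pd_orf1"): 62.0, ("hr_orf2", "pd_orf3"): 55.0, ("hr_orf3", "pd_orf2"): 31.0}
pd_vs_hr = {("pd_orf1", "hr_orf1"): 60.5, ("pd_orf3", "hr_orf2"): 44.0, ("pd_orf2", "hr_orf3"): 30.0}

shared = reciprocal_pairs(hr_vs_pd, pd_vs_hr)
print(shared)
```

In this toy input only the first pair passes in both directions. The second pair passes one way (55.0) but fails the other (44.0), which is the kind of one-sided hit a reciprocal criterion is meant to exclude.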
Four mitochondrial genomes of Hydrodictyaceae showed structural variability similar to that seen across the order (Farwagi, Fučíková & McManus, 2015). Thus far the structure of the plastomes is conserved between Hydrodictyon reticulatum and Pediastrum duplex, though additional plastomes within the family are anticipated to shed light on intrafamilial plastome evolution. The gene-order phylogenetic analysis presented here resulted in several well-supported relationships (Fig. 4) also recovered in individual gene and phylogenomic studies (Fučíková, Lewis & Lewis, 2016a; Fučíková, Lewis & Lewis, 2016b), indicating evolutionary relationships can be recovered using genome structure for this group. Incorporating additional Hydrodictyaceae (i.e., Pseudopediastrum and Stauridium) will determine if phylogenetic signal is reflected in plastome structure within the family.
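One simple way to quantify how much two gene orders differ is the breakpoint distance, which counts the gene adjacencies present in one genome but absent from the other. The Python sketch below is offered only to illustrate that idea. It is not presented as the method behind Fig. 4, it treats gene orders as linear rather than circular, and the gene labels and orders are invented for the example.

```python
def adjacencies(order):
    """Return the set of signed adjacencies in a linear gene order.

    Each gene is a (name, strand) tuple with strand +1 or -1. An adjacency
    (a, b) read on one strand is the same as (-b, -a) read on the other,
    so each adjacency is stored in a single canonical form.
    """
    adj = set()
    for (g1, s1), (g2, s2) in zip(order, order[1:]):
        forward = ((g1, s1), (g2, s2))
        reverse = ((g2, -s2), (g1, -s1))
        adj.add(min(forward, reverse))
    return adj


def breakpoint_distance(order_a, order_b):
    """Count adjacencies of order_a that are absent from order_b."""
    return len(adjacencies(order_a) - adjacencies(order_b))


# Invented toy gene orders: order_b carries an inversion of the g2-g3 segment.
order_a = [("g1", 1), ("g2", 1), ("g3", 1), ("g4", 1)]
order_b = [("g1", 1), ("g3", -1), ("g2", -1), ("g4", 1)]

print(breakpoint_distance(order_a, order_a))
print(breakpoint_distance(order_a, order_b))
```

Two genomes with identical gene order, as reported here for H. reticulatum and P. duplex, would have a distance of zero under this measure, while each inversion such as the one in the toy example introduces up to two breakpoints.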

CONCLUSIONS
The plastome data reported here for two representatives from the Hydrodictyaceae, Hydrodictyon reticulatum and Pediastrum duplex, provide further insights into the evolution of plastomes in the Sphaeropleales and highlight plastome variability across the order. These plastomes represent the largest thus far sequenced from the Sphaeropleales, with the increased size attributable not only to expansion of the IR and non-coding regions but also to infiltration of both plastomes by numerous novel open reading frames, many identified as putative homing endonucleases and reverse transcriptases. Though both plastomes have acquired many orfs, the lack of similarity between them suggests independent acquisition in each lineage and further suggests a potential susceptibility of the hydrodictyacean plastome to invasion by novel elements. Phylogenetic analysis using plastome gene order in the Sphaeropleales is consistent with currently accepted phylogenetic schemes and provides an additional source of data for tree reconstruction across the order. More plastomes will need to be sequenced for the Hydrodictyaceae in order to test whether orf infiltration is common across the family or restricted to the Hydrodictyon/Pediastrum assemblage.