The genome sequence of Gari tellinella (Lamarck, 1818), a sunset clam

We present a genome assembly from an individual Gari tellinella (Mollusca; Bivalvia; Cardiida; Psammobiidae). The genome sequence is 1,598 megabases in span. The majority of the assembly (99.85%) is scaffolded into 19 chromosomal pseudomolecules. The complete mitochondrial genome was also assembled and is 18.5 kilobases in length.


Background
Gari tellinella belongs to a group of bivalves known as the sunset clams (Psammobiidae). This small (up to 26 mm), thin-shelled, often colourful bivalve is found in silty coarse sands and shell gravel from the intertidal to shelf depths around most of the UK, except the southern North Sea. Its range extends from northern France through to Norway and Sweden, but sporadic records are also noted from northwestern and southern Spain and a few from the Mediterranean.
Flattened, elongated oval in outline, externally G. tellinella is sculpted by concentric lines and clear growth stops and bears a smooth outer shell margin. Cream, pale brown, orange, purple or white in colour, it frequently has umbonal rays of orange, red or white that extend to the outer margin. Internally, G. tellinella can be orange, mustard yellow or white, again with umbonal rays visible as white or reddish orange. The pallial sinus that runs between the two adductor muscles is half the length of the shell and narrowly curved; the bottom half is confluent with the pallial line.
Gari tellinella is most similar to Gari depressa, but is a third of the size and has a narrowly rounded posterior margin, in contrast to the almost truncate posterior of G. depressa. There is also no posterior gape between the valves in G. tellinella, whereas G. depressa has a small gape.

Genome sequence report
The genome was sequenced from a single G. tellinella collected from Jennycliff Bay, Plymouth Sound, Plymouth, UK (Figure 1). A total of 34-fold coverage in Pacific Biosciences single-molecule HiFi long reads and 25-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data.
Manual assembly curation corrected 116 missing joins or misjoins and removed 11 haplotypic duplications, reducing the assembly size by 0.40% and the scaffold number by 36.72%, and increasing the scaffold N50 by 1.61%.
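The curation percentages are relative changes between the pre- and post-curation assemblies. As a minimal sketch, the hypothetical helper `percent_change` below computes such a change. The pre-curation scaffold count of 128 is not stated in the text; it is an assumption, back-calculated from the 81 scaffolds reported for the final assembly and the 36.72% reduction.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent (negative = reduction)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Assumption: 128 pre-curation scaffolds, back-calculated from the reported figures.
pre_curation_scaffolds = 128
final_scaffolds = 81  # scaffold count reported for the final assembly

change = percent_change(pre_curation_scaffolds, final_scaffolds)
print(f"Scaffold number changed by {change:.2f}%")
```

The same helper applies to assembly size or scaffold N50, given the before and after values.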
The final assembly has a total length of 1,598 Mb in 81 sequence scaffolds with a scaffold N50 of 85.3 Mb (Table 1). The majority, 99.85%, of the assembly sequence was assigned to 19 chromosomal-level scaffolds, representing 19 autosomes (numbered by sequence length) (Figure 2-Figure 5; Table 2). Chromosome 9 consists of a mosaic of haplotypes that is arranged into one of two valid structural possibilities.
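For readers unfamiliar with the metric, the scaffold N50 is the length such that scaffolds of that length or longer together contain at least half of the assembled bases. The sketch below illustrates the calculation on toy lengths; the function name `n50` and the example values are illustrative assumptions, not the real scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of length >= L
    together cover at least half of the total length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths in Mb (illustrative only, not the real assembly).
toy_scaffolds = [120, 100, 90, 85, 60, 40, 5]
print(n50(toy_scaffolds))
```

Replacing the halfway threshold with 90% of the total gives the N90 in the same way.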
The assembly has a BUSCO v5.1.2 (Manni et al., 2021) completeness of 79.6% (single 78.4%, duplicated 1.2%) using the mollusca_odb10 reference set (n=5294). However, we believe that this relatively low BUSCO score is a result of limitations with the current mollusca_odb10 geneset. Using the metazoa_odb10 reference set (n=954), the assembly has a completeness of 94.9% (single 94.2%, duplicated 0.7%), which we believe is evidence of high completeness. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited.
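BUSCO completeness is the sum of the single-copy and duplicated percentages. The short check below confirms that the figures reported above are internally consistent for both lineage sets. The gene counts it prints are approximations derived from the rounded percentages, not values taken from the BUSCO output, and the function names are hypothetical.

```python
from decimal import Decimal

# Percentages and set sizes as reported in the text.
busco = {
    "mollusca_odb10": {"n": 5294, "single": Decimal("78.4"),
                       "duplicated": Decimal("1.2"), "complete": Decimal("79.6")},
    "metazoa_odb10": {"n": 954, "single": Decimal("94.2"),
                      "duplicated": Decimal("0.7"), "complete": Decimal("94.9")},
}


def complete_percent(single: Decimal, duplicated: Decimal) -> Decimal:
    """Complete BUSCOs (%) = single-copy (%) + duplicated (%)."""
    return single + duplicated


def approx_complete_genes(n: int, complete: Decimal) -> int:
    """Rough count of complete BUSCOs, estimated from a rounded percentage."""
    return int(round(Decimal(n) * complete / Decimal(100)))


for lineage, r in busco.items():
    total = complete_percent(r["single"], r["duplicated"])
    consistent = total == r["complete"]
    genes = approx_complete_genes(r["n"], r["complete"])
    print(f"{lineage}: {total}% complete (consistent: {consistent}), "
          f"about {genes} of {r['n']} genes")
```

`Decimal` is used rather than floats so that sums such as 78.4 + 1.2 compare exactly with the reported 79.6.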

Sample acquisition and DNA extraction
A single G. tellinella specimen (xbGarTell2) was collected by boat from sand in Jennycliff Bay, Plymouth Sound, Plymouth, UK (latitude 50.3394, longitude -4.1311) by Teresa Derbyshire (National Museum of Wales), Mitchell Brennan, Sean McTierney and Allison Small (Marine Biological Association). The specimen was identified by Anna Holmes (National Museum of Wales) and snap-frozen in liquid nitrogen.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The xbGarTell2 sample was weighed and dissected on dry ice with tissue set aside for Hi-C and RNA sequencing. Tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200-ng aliquot of extracted DNA using 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared into an average fragment size between 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and Qubit Fluorometer and Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics Chromium read cloud sequencing libraries were constructed according to the manufacturers' instructions. Sequencing was performed by the Scientific Operations core at the Wellcome Sanger Institute on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (10X) instruments. Hi-C data were generated in the Tree of Life laboratory from remaining tissue of xbGarTell2 using the Arima v2 kit and sequenced on a NovaSeq 6000 instrument.

Chase Smith
University of Texas, Austin, USA
This study will be integral toward further improving bivalve genomic resources and provides a high quality genome for Gari tellinella. I think the authors have done a nice job presenting the study and I only have a few minor questions/suggestions before indexing.
○ In Figure 2, you present the molluscan lineage BUSCO report. I agree this has issues since it is extremely limited. I would recommend replacing it with the metazoan report since it is probably a better evaluation of genome completeness.

○ In the Methods under sample acquisition: "Hi-C and RNA sequencing". Assuming you mean DNA?

○ The contact map is fantastic. Many researchers (including myself) have had extensive difficulties with generating libraries for chromatin sequencing (e.g., HiC, OmniC). If there were any specifics that were done to ensure proper digestion and library preparation it would be useful to add them to the Methods.

○ Any reason you did not provide an annotation for the genome? I do not think it is a necessity given the quality of this assembly, but it would greatly increase the use of the data. Although no RNAseq data generated in this study are available, proteins from recent bivalve genome assemblies could be used as a training dataset.

Is the rationale for creating the dataset(s) clearly described? Yes

Are the protocols appropriate and is the work technically sound? Partly
Are sufficient details of methods and materials provided to allow replication by others?

Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Bivalve genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

The reported genome resource is of sufficient quality to be accepted in WOR. All methods and analyses conducted are well described and reflect the state of the art for producing a chromosome-level genome assembly. With respect to the BUSCO score, there is a limitation with the current version of the mollusca_odb10 gene set. Thus, I fully expected the completeness reported.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Marine genomics and molecular ecology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

David Osca
Instituto Universitario de Estudios Ambientales y Recursos Naturales, Faculty of Marine Sciences, Department of Biology, University of Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Canary Islands, Spain
First and foremost, I would like to extend my congratulations to the authors for their commendable efforts in working with genomes and their significant contribution through the publication of a new mollusk genome. However, I would like to offer some suggestions for improving the article, as I believe they are necessary:
i) In the "Background" section, it would be beneficial to include citations to support the information provided about the species and its distribution. This would help clarify whether the descriptions and data presented have been obtained from previous publications.
ii) In the "Genome Sequence Report" and "Methods" sections, while the specific organism that has been sequenced is described, the article does not mention the sex of the individual. Although this information is typically not crucial, in the case of Bivalvia, where doubly uniparental inheritance (DUI) occurs, it is important to indicate whether the individual studied was male or female. This is particularly relevant because sister groups of Psammobiidae show DUI (Smith et al., 2023) [1]. Including this information would help avoid potential conflicts in the interpretation of the obtained data.
iii) It is essential to provide the voucher code assigned to the specimen studied, as well as any relevant information regarding the DNA or leftover tissue associated with the study.
iv) Consider using a black background, including a scale, and ensuring the correct orientation of the bivalve is depicted. These modifications will enhance the clarity and professionalism of Figure 1.
v) The article explains the number of chromosomes and autosomes, but there is no mention of the annotation of the mitogenome, such as the number of genes or a figure depicting their arrangement.It would be helpful to include this information to provide a comprehensive understanding of the genome.
vi) Although the size of the mitogenome is mentioned in the abstract, it is not explicitly stated in the main text of the article.While Table 2 makes a vague reference to the data (0.02 megabases), the abstract states a more specific size (18.5 kilobases).To ensure consistency and clarity, it is advisable to include the precise size of the mitogenome within the main body of the article.
In conclusion, I would like to once again commend the authors for their remarkable work and valuable contribution to the field. By implementing these suggested improvements, the article will become more comprehensive.
Reviewer Expertise: Mollusca, marine macroinvertebrates, genetics, genomics, biodiversity, evolution, phylogenetic studies, barcoding.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Figure 1. Images of the Gari tellinella specimen taken during preservation and processing.

Figure 2. Genome assembly of Gari tellinella, xbGarTell2.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 1,597,633,241 bp assembly. The distribution of chromosome lengths is shown in dark grey with the plot radius scaled to the longest chromosome present in the assembly (121,778,318 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 chromosome lengths (85,279,272 and 67,439,623 bp), respectively. The pale grey spiral shows the cumulative chromosome count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the mollusca_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/xbGarTell2.1/dataset/CAKLPP01.1/snail.

Figure 5. Genome assembly of Gari tellinella, xbGarTell2.1: Hi-C contact map. Hi-C contact map of the xbGarTell2.1 assembly, visualised in HiGlass. Chromosomes are arranged in size order from left to right and top to bottom. The interactive Hi-C map can be viewed here.

Reviewer Report 02 June 2023
https://doi.org/10.21956/wellcomeopenres.19711.r57012
© 2023 Osca D. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References
1. Smith CH, Pinto BJ, Kirkpatrick M, Hillis DM, et al.: A tale of two paths: The evolution of mitochondrial recombination in bivalves with doubly uniparental inheritance. J Hered. 2023; 114 (3): 199-206

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Partly
Competing Interests: No competing interests were disclosed.

Table 3. Software tools used.
The genome sequence is released openly for reuse. The G. tellinella genome sequencing initiative is part of the Darwin Tree of Life (DToL) project. All raw sequence data and the assembly have been deposited in INSDC databases. The genome will be annotated and presented through the Ensembl pipeline at the European Bioinformatics Institute. Raw data and assembly accession identifiers are reported in Table 1.
Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.
