Supporting data for "The Gene-Rich Genome of the Scallop Pecten maximus"

Dataset type: Genomic
Data released on March 19, 2020

Kenny NJ; Dudchenko O; James K; Betteridge E; Corton C; Dolucan J; Mead D; Oliver K; Omer A; Pelan S; Ryan Y; Sims Y; Skelton J; Smith M; Torrance J; Weisz D; Wipat A; Aiden EL; Howe K; Williams ST (2020): Supporting data for "The Gene-Rich Genome of the Scallop Pecten maximus" GigaScience Database. https://doi.org/10.5524/100726

DOI: 10.5524/100726

The King Scallop, Pecten maximus, is distributed in shallow waters along the Atlantic coast of Europe. It forms the basis of a valuable commercial fishery, and its ubiquity means that it plays a key role in coastal ecosystems and food webs. Like other filter-feeding bivalves, it can accumulate potent phytotoxins, to which it has evolved some immunity. The molecular origins of this immunity are of interest to evolutionary biologists, pharmaceutical companies and fisheries management.
Here we report the genome sequencing of this species, conducted as part of the Wellcome Sanger 25 Genomes Project. This genome was assembled from PacBio reads and scaffolded with 10x Chromium and Hi-C data, and its 3,983 scaffolds have an N50 of 44.8 Mb (longest scaffold 60.1 Mb), with 92% of the assembly sequence contained in 19 scaffolds, corresponding to the 19 chromosomes found in this species. The total assembly spans 918.3 Mb, and is the best-scaffolded marine bivalve genome published to date, exhibiting 95.5% recovery of the metazoan BUSCO set. Gene annotation resulted in 67,741 gene models. Analysis of gene content revealed large numbers of gene duplicates, as previously seen in bivalves, with little gene loss, in comparison with the sequenced genomes of other marine bivalve species.
The genome assembly of Pecten maximus and its annotated gene set provide a high-quality platform for a wide range of investigations, including studies on such disparate topics as shell biomineralization, pigmentation, vision and resistance to algal toxins. As a result of our findings, we highlight the sodium channel gene Nav1, known to confer resistance to saxitoxin and tetrodotoxin, as a candidate for further studies investigating immunity to domoic acid.
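
As a rough illustration of how the scaffold N50 quoted in the abstract above is defined, the following Python sketch computes N50 from a list of sequence lengths. The helper names `fasta_lengths` and `n50` are hypothetical and are not part of the dataset or of the authors' pipeline; the sketch assumes a plain, uncompressed FASTA file and uses the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly).

```python
def fasta_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header closes the previous record (if any) and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a non-empty collection of sequence lengths."""
    if not lengths:
        raise ValueError("n50 requires at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from longest to shortest until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

With one of the genome FASTA files downloaded and decompressed, `n50(fasta_lengths(open(path)))` would give the scaffold N50 in base pairs, where `path` is whatever local file name was chosen. Whether the result matches the 44.8 Mb reported above depends on using the same assembly version.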

Additional details

Read the peer-reviewed publication(s):

  • Kenny, N. J., McCarthy, S. A., Dudchenko, O., James, K., Betteridge, E., Corton, C., Dolucan, J., Mead, D., Oliver, K., Omer, A. D., Pelan, S., Ryan, Y., Sims, Y., Skelton, J., Smith, M., Torrance, J., Weisz, D., Wipat, A., Aiden, E. L., … Williams, S. T. (2020). The gene-rich genome of the scallop Pecten maximus. GigaScience, 9(5). https://doi.org/10.1093/gigascience/giaa037 (PubMed:32352532)

Additional information:

www.dnazoo.org/assemblies/Pecten_maximus

http://dx.doi.org/10.6084/m9.figshare.10311068

Accessions (data included in GigaDB):

BioSample: SAMEA994736
BioSample: SAMN12747920
Assembly: GCA_902652985.1

Accessions (data not in GigaDB):

BioProject: PRJNA242355
BioProject: PRJNA298284
BioProject: PRJEB17629

| Sample ID | Common Name | Scientific Name | Sample Attributes | Taxonomic ID | Genbank Name |
| SAMEA994736 | | Pecten maximus | Local environmental context:wild caught [FOODON_00...; Geographic location (latitude and longitude):nor r...; Sex:hermaphrodite [PATO:0001340]; ... | 6579 | |
| GSM4417291Pm_mantle_a | | Pecten maximus | Alternative accession-GEO:GSE147107; Analyte type:RNA; Tissue:mantle [UBERON:0006575]; ... | 6579 | |
| GSM4417292Pm_mantle_b | | Pecten maximus | Alternative accession-BioSample:SAMN14390691; Alternative accession-BioProject:N/A; Alternative accession-GEO:GSE147107; ... | 6579 | |

| Description | Data Type | File Format | Size | Release Date | File Attributes |
| | Readme | TEXT | 4.43 kB | 2020-03-19 | MD5 checksum: 77ca74ec013788a2b4b3c972bb96c562 |
| Read quality assessment, FastQC | Other | HTML | 1.21 MB | 2020-03-19 | MD5 checksum: 0ef3cd6ed153a66adb0e9667b0062806 |
| Hard masked ('N') repeats version of genome | Genome sequence | FASTA | 183.07 MB | 2020-03-19 | MD5 checksum: 8da1d89d0e722f7b45e0a635577ca605 |
| Soft masked (lower case) repeats version of genome | Genome sequence | FASTA | 257.62 MB | 2020-03-19 | MD5 checksum: 30aca34d0e00d6dda233506548df12fe |
| Best blast hit of all gene models - no filtering of gene models | Other | TEXT | 10.93 MB | 2020-03-19 | MD5 checksum: 418ca0dabd67dd13456a72215ca81786 |
| CDS sequence of all gene models - no filtering of gene models | Coding sequence | FASTA | 34.79 MB | 2020-03-19 | MD5 checksum: 4db64cca7303978f63f70b068665c54b |
| Amino acid sequence of all gene models - no filtering of gene models | Protein sequence | FASTA | 22.34 MB | 2020-03-19 | MD5 checksum: 1dea816233487da5264d6ef3dba9b954 |
| GFF3 annotation file of all gene models - no filtering of gene models | Annotation | UNKNOWN | 22.78 MB | 2020-03-19 | MD5 checksum: 2fca95f4e7412cf24a425c8bd27a3419 |
| GTF annotation file of all gene models - no filtering of gene models | Annotation | UNKNOWN | 22.63 MB | 2020-03-19 | MD5 checksum: 257f2d92d7ea73a470bbe3791dd96201 |
| CDS sequence of filtered gene models - Final gene set | Coding sequence | FASTA | 17.05 MB | 2020-03-19 | MD5 checksum: 26cd1409b7f9480f562464723b78c900 |

| Funding body | Awardee | Award ID | Comments |
| Natural History Museum (NHM) | ST Williams | SDR17012 | |
| H2020 Marie Skłodowska-Curie Actions | NJ Kenny | 750937 | |
| Wellcome Trust | S McCarthy | WT207492 | |
| National Science Foundation | E Aiden | PHY1427654 | Physics Frontiers Center Award |
| Welch Foundation | E Aiden | Q-1866 | |
| National Institutes of Health | E Aiden | U01HL130010 | 4D Nucleome Grant |
| DNA Genotek (CA) | E Aiden | UM1HG009375 | |
| U.S. Department of Agriculture | E Aiden | 2017-05741 | Agriculture and Food Research Initiative Grant |
| Marie Curie Alumni Association | NJ Kenny | N/A | |

| Date | Action |
| March 19, 2020 | Dataset publish |
| April 8, 2020 | Manuscript Link added: 10.1093/gigascience/giaa037 |
| October 7, 2022 | Manuscript Link updated: 10.1093/gigascience/giaa037 |
| February 9, 2023 | Link updated: Assembly:GCA_902652985.1 |