The genome sequence of the Grey Pine Carpet, Thera obeliscata (Hübner, 1787)

We present a genome assembly from an individual male Thera obeliscata (the Grey Pine Carpet; Arthropoda; Insecta; Lepidoptera; Geometridae). The genome sequence is 404.7 megabases in span. Most of the assembly is scaffolded into 18 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 17.5 kilobases in length.


Background
The Grey Pine Carpet, Thera obeliscata (Hübner, 1787), is a moderate-sized geometrid moth with a wingspan of about 28-36 mm. Its forewings are light to greyish brown but variable: moths found toward the north and west of Britain are often more reddish, with darker basal and median fasciae. The median fascia varies in shape, but is always narrowed towards the trailing edge, with its inner margin only slightly angled inward, unlike in the otherwise similar Pine Carpet T. firmata Hübner, 1822. It can be hard to distinguish from other Thera species. Morphologically, the diagnostic external difference is the male antenna, which is simple in T. obeliscata and slightly serrate in T. britannica (Turner, 1925) (Boyes et al., 2023; Lewis, 2013).
The species is bivoltine in Britain, with flight periods between late April and mid-July, and again between late August and early November (Randle et al., 2019). It overwinters as a larva and pupates underground (Waring et al., 2017). The Grey Pine Carpet occurs in coniferous forest as well as parks and gardens (Waring et al., 2017). The larva feeds on many conifers, such as Scots Pine, Norway Spruce, Western Hemlock-spruce, Western Red-cedar, Lawson's Cypress, Monterey Cypress and Douglas Fir (Waring et al., 2017). T. obeliscata is a generally common or locally abundant species that is widespread in the British Isles and Channel Isles. It is widespread in the western Palaearctic, including Scandinavia but not the south of Spain, with scattered records in non-oriental Asia (GBIF Secretariat, 2023).
There are conflicting reports of population changes of the Grey Pine Carpet in the UK. Despite the increase in conifer planting since the 19th century (Waring et al., 2017), the species is reported to have decreased in abundance by a substantial 47% between 1970 and 2016, and by 9% in distribution range. This is in sharp contrast to the recently colonising Spruce Carpet T. cupressata (Geyer, 1831) (Randle et al., 2019). Between 1968 and 2006, however, Rothamsted trap numbers of T. obeliscata increased by about 0.5% annually on average (Conrad et al., 2006).
In Britain, T. obeliscata exhibits a single mitochondrial cluster on BOLD, BOLD:AAA7522 (25/02/2023), with up to about 1.28% intraspecific divergence reported by BOLD in Europe. This Barcode Index Number is shared with T. britannica (Turner, 1925), a species which nevertheless comprises a separate haplogroup on BOLD, including UK examples still (mis-)identified as T. variata (Denis & Schiffermüller, 1775), and also encompasses hybrids with T. variata. BOLD:AAA7521 is a related cluster found in mainland Europe but not yet in the UK; it currently also has mixed identities, comprising T. obeliscata, T. cembrae and T. variata (as of 25/08/2023). The genome sequence will be useful in comparison with that of T. britannica, the potential sister species of T. obeliscata, whose genome is available (Boyes et al., 2023), considering that their mitogenomes are only 1.37% pairwise divergent as measured over 658 bp of COI-5P (OX387930 vs OW618053, respectively). It will also be of use in future molecular-aided taxonomic work, in studying apparently recently separated sibling species, and in studies of hybridisation (Boyes et al., 2023). Thera Stephens, 1831 is a member of the larentiine tribe Cidariini, in which Õunap et al. (2016) place Thera as sister to Pennithera Viidalepp, 1980 (see also Choi, 2000).
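The 1.37% figure quoted above for 658 bp of COI-5P corresponds to roughly 9 differing sites (9/658 ≈ 0.0137). The following is a minimal sketch of how such a value can be computed, assuming an uncorrected pairwise distance (p-distance) over pre-aligned sequences; the text does not state which distance model was used, and the function name `p_distance` and the toy sequences are hypothetical illustrations, not the deposited barcodes.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two pre-aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped,
    so the result is differences / compared unambiguous sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy example (hypothetical sequences): 1 difference over 8 compared sites.
toy_distance = p_distance("ACGTACGT", "ACGTACGA")

# Arithmetic check for the figure in the text: 9 differences over 658 bp.
percent_from_nine_sites = round(9 / 658 * 100, 2)
```

A model-corrected distance (for example Kimura two-parameter) would give a slightly different value, so this sketch should be read as illustrative only.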

Genome sequence report
The genome was sequenced from one male Thera obeliscata (Figure 1) collected from Beinn Eighe National Nature Reserve, Scotland, UK (57.63, -5.35). A total of 57-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 19 missing joins or mis-joins and removed 2 haplotypic duplications, reducing the assembly length by 0.65% and the scaffold number by 11.54%. The final assembly has a total length of 404.7 Mb in 45 sequence scaffolds with a scaffold N50 of 25.3 Mb (Table 1). Most (95.8%) of the assembly sequence was assigned to 18 chromosomal-level scaffolds, representing 17 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). The karyotype of Thera obeliscata in this sample does not match the expected karyotype of 13 (Suomalainen, 1965). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
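For readers unfamiliar with the scaffold N50 reported above, the following is a minimal sketch of how an Nx statistic is conventionally computed from a list of scaffold lengths. The function name `nx` and the toy lengths are hypothetical; they are not the scaffold lengths of this assembly, and this is not the code used to produce Table 1.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx length for a collection of sequence lengths.

    Sequences are sorted from longest to shortest and summed until the
    running total reaches `fraction` of the total span; the length of the
    sequence that crosses that threshold is the Nx value
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Toy scaffold lengths (hypothetical), total span 30.
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = nx(toy_lengths)        # 10 + 8 = 18 >= 15, so N50 is 8
toy_n90 = nx(toy_lengths, 0.9)   # 10 + 8 + 6 + 4 = 28 >= 27, so N90 is 4
```

Applied to a real assembly, the lengths would be read from the scaffold FASTA or its index; a high N50 relative to the total span indicates that most of the sequence sits in long scaffolds.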

Sample acquisition and nucleic acid extraction
The Thera obeliscata specimens used in this study were collected from Beinn Eighe National Nature Reserve, Scotland, UK (latitude 57.63, longitude -5.35) on 2021-09-09 and 2021-09-10, using a light trap. The specimens were collected and identified by David Lees (Natural History Museum) and were dry frozen (-80°C). One of these specimens (specimen ID NHMUK014543806, individual ilTheObel3) was used for DNA sequencing, and another (specimen ID NHMUK014543809, individual ilTheObel2) was used for Hi-C data.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilTheObel3 sample was weighed and dissected on dry ice. Head and thorax tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA sample. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit.
Fragment size distribution was evaluated by running the sample on the FemtoPulse system. A Hi-C map for the final assembly was produced using bwa-mem2 (Vasimuddin et al., 2019) in the Cooler file format (Abdennur & Mirny, 2020). To assess the assembly
Table 3 contains a list of relevant software tool versions and sources.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

Jerome Hui
The Chinese University of Hong Kong, Hong Kong, Hong Kong
Lees and colleagues report the genome sequence of a male Grey Pine Carpet, Thera obeliscata (Hübner, 1787). This moth species is commonly found in Britain. Molecular data for this species are mainly confined to COI sequences deposited in the NCBI database, and it is good to see this information included in the background of the report (along with some comparison to other genomes generated by the Darwin Tree of Life). This new genome resource will be useful for further studies, such as revealing the species' population structure and its relationship with host plants, understanding the effects of climate change and/or anthropogenic activities on it, and depicting its evolutionary relationships with other lepidopterans.
Judging from the summary statistics, this genome resource is excellent, with high BUSCO completeness (98.3%), high sequence continuity (scaffold N50), and the majority of sequences contained in the 18 pseudochromosomes (plus the mitochondrion). To sum up, this is another valuable contribution by the Darwin Tree of Life.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: I have published with Prof. Peter Holland more than 3 years ago. I confirm that this potential conflict of interest did not affect my ability to write an objective and unbiased review of the article.
Reviewer Expertise: Genomics, evolution, invertebrates
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Paul Frandsen
Brigham Young University-Idaho, Rexburg, Idaho, USA
This is a nice contribution on a species that is widespread, but has nevertheless experienced recent declines in population numbers. It also appears that the species boundaries are a bit unclear, and a genome may help with those determinations. My only critique is that the discussion of the mitochondrial divergence between the Grey Pine Carpet and its potential sister species could be made a bit clearer.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Biodiversity genomics, bioinformatics, systematics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Thera obeliscata, ilTheObel3.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 404,701,398 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (29,171,840 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (25,318,533 and 14,367,994 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilTheObel3.1/dataset/CANPUM01/snail.

Figure 3. Genome assembly of Thera obeliscata, ilTheObel3.1: BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilTheObel3.1/dataset/CANPUM01/blob.

Figure 4. Genome assembly of Thera obeliscata, ilTheObel3.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilTheObel3.1/dataset/CANPUM01/cumulative.

Figure 5. Genome assembly of Thera obeliscata, ilTheObel3.1: Hi-C contact map of the ilTheObel3.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=TUsvOtGYTDKCHidtk4PFTQ.

Reviewer Report 06 November 2023
https://doi.org/10.21956/wellcomeopenres.22151.r67455
© 2023 Frandsen P. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Open Peer Review Current Peer Review Status: Version 1