The genome sequence of the Grey Chi, Antitype chi (Linnaeus, 1761)

We present a genome assembly from an individual male Antitype chi (the Grey Chi; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence is 632.2 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 15.3 kilobases in length.


Background
The Grey Chi, Antitype chi, is a medium-sized noctuid moth with a whitish-grey background reticulated with darker markings, including a distinctive black mark on the forewing likened to an anvil or the Greek letter χ. These patterns camouflage it on walls and rocks, and semi-melanic forms occur, supposedly adaptive to dark substrates. The moth is monovoltine and flies in August and September in the UK, overwintering as an egg (Waring et al., 2017).
The Grey Chi is found in moorlands and on grassy hillsides in the uplands. The larva appears to be quite polyphagous, feeding on the leaves of various shrubs such as Crataegus and Ribes, and, in captivity, has been fed on a wide variety of low-growing plants such as Rumex (Waring et al., 2017).
Antitype chi is generally common and widespread in the western Palaearctic only, from southern Scandinavia to the shores of the Mediterranean, with scattered records into Russia (GBIF Secretariat, 2022). In the UK (NBN Atlas Partnership, 2021), it is rather local, rare or vagrant in south-eastern England, and relatively common towards the north (Waring et al., 2017), with records concentrated in northern and western areas. However, populations in the UK have shown an overall decrease of 57% between 1970 and 2016 (Fox et al., 2021), affecting both abundance and distribution (Randle et al., 2019).
The genus Antitype (Hübner, 1821) is currently placed in the noctuid tribe Xylenini, with five congeners. It is quite close on BOLD (within about 5.5% pairwise divergence in COI-5P) to genera such as Leucochlaena Hampson, 1906 and Polymixis (Hübner, 1820), but it has apparently not yet been included in molecular phylogenetic works, so the genome of A. chi will be very useful for evolutionary studies.
The genome sequence should be useful not only in phylogeny but also in studies of potentially cryptic species. There are essentially two DNA barcode clusters on BOLD (10 March 2023): the BINs BOLD:AAE7040 (most exemplars, including the UK ones) and BOLD:ADF3730 (two records only, from Germany and Norway, which differ by just one base; these are about 1.12% pairwise divergent from BOLD:AAE7040). A closely related cluster (BOLD:AAM0440) from southern Europe, about 2.5% pairwise divergent from BOLD:AAE7040, is otherwise identified as A. suda (Geyer, 1832) or A. jonis (Lederer, 1865).
For discussion of the controversy about alleged industrial melanism in the Grey Chi, see Fryer (2013) and the website of historian Alan Brooke.
The genome of Antitype chi was sequenced as part of the Darwin Tree of Life Project, a collaborative effort to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. Here we present a chromosomally complete genome sequence for Antitype chi, based on one male specimen from Beinn Eighe National Nature Reserve, Scotland.
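The pairwise divergence figures quoted for the barcode clusters can be illustrated with a minimal sketch of an uncorrected p-distance between two aligned sequences. This is an illustration only: the function name `p_distance` and the two short sequences are hypothetical, and the distance model actually used to obtain the BOLD values is not stated here, so it may differ from the simple proportion computed below.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise divergence (%) between two aligned sequences.

    Only positions where both sequences have an unambiguous base
    (A, C, G or T) are compared; gaps and ambiguity codes are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


# Hypothetical 20-base fragments differing at a single site
# (not real Antitype barcode sequences).
toy_a = "ACGTACGTACGTACGTACGT"
toy_b = "ACGTACGTACGAACGTACGT"
divergence = p_distance(toy_a, toy_b)
print(f"{divergence:.2f}% divergent")
```

On real barcode data the same function would be applied to full-length aligned COI-5P sequences rather than to toy fragments.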

Genome sequence report
The genome was sequenced from one male Antitype chi (Figure 1) collected from Beinn Eighe National Nature Reserve, Scotland, UK (latitude 57.63, longitude -5.35). A total of 25-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 13 missing joins or mis-joins and removed three haplotypic duplications, reducing the scaffold number by 7.32% and increasing the scaffold N50 by 1.11%.
The final assembly has a total length of 632.2 Mb in 38 sequence scaffolds with a scaffold N50 of 21.7 Mb (Table 1). Most (99.98%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 30 autosomes and the Z sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 2-Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
The estimated Quality Value (QV) of the final reference assembly is 66.3 with k-mer based completeness of 100%, and the assembly has a BUSCO v5.3.2 completeness of 99.0% (single = 98.4%, duplicated = 0.5%), using the lepidoptera_odb10 reference set (n = 5,286). Metadata for specimens, spectral estimates, sequencing runs, contaminants and pre-curation assembly statistics can be found here.
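The percentage changes reported for curation can be related to raw counts with simple arithmetic. The sketch below assumes a pre-curation count of 41 scaffolds; that number is not given in the text and is used only because it is consistent with a final count of 38 scaffolds and the reported 7.32% reduction. The helper name `percent_change` is likewise hypothetical.

```python
def percent_change(before: float, after: float) -> float:
    """Signed percentage change going from `before` to `after`."""
    if before == 0:
        raise ValueError("`before` must be non-zero")
    return (after - before) / before * 100.0


# ASSUMPTION: 41 pre-curation scaffolds (not stated in the text).
scaffolds_before = 41
# Final scaffold count reported for the curated assembly.
scaffolds_after = 38

# A negative change is a reduction, so flip the sign for reporting.
reduction = -percent_change(scaffolds_before, scaffolds_after)
print(f"scaffold number reduced by {reduction:.2f}%")
```

The same helper applies to the scaffold N50, where a positive result indicates an increase after curation.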
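For readers unfamiliar with the scaffold N50 statistic, the following is a minimal sketch of how an Nx value is computed from a list of scaffold lengths. The function name `nx` and the toy lengths are illustrative assumptions and are not taken from the assembly itself.

```python
from typing import Iterable


def nx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nx length for a set of scaffold lengths.

    Scaffolds are sorted from longest to shortest and summed until the
    running total reaches `fraction` of the assembly span; the length of
    the scaffold that crosses the threshold is returned.
    fraction=0.5 gives N50 and fraction=0.9 gives N90.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths supplied")
    threshold = sum(ordered) * fraction
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # guard against floating-point rounding


# Toy scaffold lengths in arbitrary units (hypothetical values).
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)
print(toy_n50, toy_n90)
```

In the toy case the span is 30 units, so the N50 threshold of 15 is crossed by the second-longest scaffold and the N90 threshold of 27 by the fourth.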
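A consensus QV is conventionally read as a Phred-scaled value, QV = -10 log10(p), where p is the estimated per-base error probability. Under that assumption, the sketch below converts the reported QV of 66.3 into an approximate error rate and an expected number of erroneous bases over the 632.2 Mb span. The function names are hypothetical, and the result is a rough interpretation rather than a figure reported by the authors.

```python
import math


def qv_to_error_probability(qv: float) -> float:
    """Per-base error probability implied by a Phred-scaled QV."""
    return 10.0 ** (-qv / 10.0)


def error_probability_to_qv(p: float) -> float:
    """Phred-scaled QV implied by a per-base error probability."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10.0 * math.log10(p)


reported_qv = 66.3        # QV reported for the final assembly
assembly_span_bp = 632.2e6  # total assembly length, 632.2 Mb

error_prob = qv_to_error_probability(reported_qv)
expected_errors = error_prob * assembly_span_bp

print(f"per-base error probability: {error_prob:.3e}")
print(f"expected erroneous bases over the span: {expected_errors:.0f}")
```

Read this way, a QV above 60 corresponds to fewer than one expected error per million bases.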

Sample acquisition and nucleic acid extraction
A male Antitype chi (specimen number NHMUK014543795, ToLID ilAntChix2) was collected from Beinn Eighe National Nature Reserve, Scotland, UK (latitude 57.63, longitude -5.35) on 10 September 2021. The specimen was collected by David Lees (Natural History Museum) using a light trap.
The specimen was identified by the collector and dry-frozen at -80°C.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute (WSI). The ilAntChix2 sample was weighed [...] Fluorometer and Qubit dsDNA High Sensitivity Assay kit.
Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from abdomen tissue of ilAntChix2 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μL RNAse-free water and its concentration assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit RNA Broad-Range (BR) Assay kit. The integrity of the RNA was analysed using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. Poly(A) RNA-Seq libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing was performed by the Scientific Operations core at the WSI on Pacific Biosciences SEQUEL II (HiFi) and Illumina NovaSeq 6000 (RNA-Seq) instruments. Hi-C data were also generated from head and thorax tissue of ilAntChix2 using the Arima v2 kit and sequenced on the Illumina NovaSeq 6000 instrument.

Genome assembly, curation and evaluation
Assembly was carried out with Hifiasm (Cheng et al., 2021) and haplotypic duplication was identified and removed with purge_dups (Guan et al., 2020). The assembly was scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination as described previously (Howe et al., 2021).
To evaluate the assembly, MerquryFK was used to estimate consensus quality (QV) scores and k-mer completeness (Rhie et al., 2020). The genome was analysed and BUSCO scores (Manni et al., 2021; Simão et al., 2015) were generated within the BlobToolKit environment (Challis et al., 2020). Table 3 contains a list of software tool versions and sources.

Ethics and compliance issues
The

Jacqueline Heckenhauer
Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Hesse, Germany
In their data note, the authors present the genome sequence of the primary assembly and contigs of the alternate haplotype of Antitype chi using a de novo assembly method following the approach used by the Darwin Tree of Life Project. The presented genome is of high quality and released openly for reuse. Moreover, it is useful for phylogenetic and taxonomic studies (potentially cryptic species). Therefore this high-quality reference genome is very beneficial to the field.
However, I suggest that the authors include a short summary of the repetitive DNA components of the assembly and submit the library of repetitive element (RE) sequences to a public repository of repetitive elements (e.g., Dfam). This way the community would benefit even more from these data. The diversity of available insect genomes has rapidly expanded, but the rate of community contributions to RE databases, which are important for RE annotation, has not kept pace, preventing high-resolution study of REs in many groups.
Was the genome size of this specimen/species estimated by flow cytometry or similar?
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Partly

Jerome H L Hui
The Chinese University of Hong Kong, Hong Kong, China
Lees and colleagues report the genome sequence of a male Grey Chi, Antitype chi (Linnaeus, 1761). This work was done as part of the Darwin Tree of Life Project, which aims to sequence all named eukaryotic species in the Atlantic Archipelago of Britain and Ireland. This species of moth can be commonly found in Britain. Molecular data for this species were scarce prior to this report and were mainly confined to COI sequences deposited in the NCBI database. This new genome resource will be very useful for further studies, such as identifying potential cryptic species, understanding its roles in the ecosystem, and revealing its evolutionary relationships with other lepidopterans.
Judging from the summary statistics, this genome resource is excellent, with high BUSCO numbers (99.0%), high sequence continuity (scaffold N50), and the majority of sequences contained on the 31 pseudochromosomes (plus mitochondrion). To sum up, this is another valuable contribution.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: I have published with Prof. Peter Holland more than 3 years ago. I confirm that this potential conflict of interest did not affect my ability to write an objective and unbiased review of the article.

Figure 2. Genome assembly of Antitype chi, ilAntChix2.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1000 size-ordered bins around the circumference with each bin representing 0.1% of the 632,218,267 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (32,898,968 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 sequence lengths (21,733,341 and 15,314,028 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilAntChix2.1/dataset/CANAHZ01/snail.

Figure 3. Genome assembly of Antitype chi, ilAntChix2.1: GC coverage. BlobToolKit GC-coverage plot. Scaffolds are coloured by phylum. Circles are sized in proportion to scaffold length. Histograms show the distribution of scaffold length sum along each axis. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilAntChix2.1/dataset/CANAHZ01/blob.

Figure 4. Genome assembly of Antitype chi, ilAntChix2.1. BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilAntChix2.1/dataset/CANAHZ01/cumulative.

Figure 5. Genome assembly of Antitype chi, ilAntChix2.1. Hi-C contact map of the ilAntChix2.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=ZP2Z6P9aSIK1F9Zxa2AEKA.

Table 2. Chromosomal pseudomolecules in the genome assembly of Antitype chi, ilAntChix2.
INSDC accession    Chromosome    Size (Mb)    GC%
materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the Darwin Tree of Life Project Sampling Code of Practice. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project. All efforts are undertaken to minimise the suffering of animals used for sequencing. Each transfer of samples is further undertaken according to a Research Collaboration Agreement or Material Transfer Agreement entered into by the Darwin Tree of Life Partner, Genome Research Limited (operating as the Wellcome Sanger Institute), and in some circumstances other Darwin Tree of Life collaborators.