The genome sequence of the lesser treble-bar moth, Aplocera efformata (Guenée, 1857)

We present a genome assembly from an individual female Aplocera efformata (the lesser treble-bar; Arthropoda; Insecta; Lepidoptera; Geometridae). The genome sequence is 350 megabases in span. Most of the assembly (99.97%) is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosomes assembled. The complete mitochondrial genome was also assembled and is 15.4 kilobases in length. Gene annotation of this assembly on Ensembl has identified 11,393 protein-coding genes.


Background
The lesser treble-bar, Aplocera efformata (Guenée, 1857), is a geometer moth in the subfamily Larentiinae (family Geometridae), a group comprising the carpets, pugs and allies (Waring & Townsend, 2017). It is hard to distinguish from its sister species, the treble-bar (Aplocera plagiata), as both are grey with three dark cross-bands on their pointed forewings. However, the lesser treble-bar is slightly smaller, with a forewing length of 16-19 mm, and has less intense dark cross-bands and lighter forewings. Its abdomen also has a shorter taper to the apex compared to the very pointed abdomen of the treble-bar (Townsend et al., 2010; Waring & Townsend, 2017).
The lesser treble-bar's range extends from Morocco across southern and central Europe, reaching Anatolia to the east and southern Scandinavia to the north (Bálint et al., 2016).
The preferred habitat of A. efformata is hot, dry grassland, mainly on sandy or calcareous ground, though it is sometimes encountered on sea-cliffs, woodland rides, abandoned quarries, field margins and gardens. The species has two generations a year, and the adults are easily disturbed by day. It overwinters as a larva and pupates underground (Bálint et al., 2016; Waring & Townsend, 2017).
In Europe, the species has been declining in population, threatened by the loss of its favoured habitat (Bálint et al., 2016). We predict that the Darwin Tree of Life assembly presented here will be an important tool for further examination of its population dynamics.

Genome sequence report
The genome was sequenced from a single female A. efformata (Figure 1) collected from Wytham Woods, Berkshire, UK (latitude 51.772, longitude -1.338). A total of 53-fold coverage in Pacific Biosciences single-molecule circular consensus (HiFi) long reads and 128-fold coverage in 10X Genomics read clouds were generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected four misjoins, which reduced the scaffold number by 7.27%.
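To make the fold-coverage figures concrete, the following minimal Python sketch relates coverage to sequencing yield. It assumes the simple definition coverage = total sequenced bases / assembly span, and uses the 349,498,550 bp span given in the Figure 2 caption. The function names are illustrative, and because the reported coverages are rounded, the yields it prints are rough estimates rather than values reported by the authors.

```python
ASSEMBLY_SPAN_BP = 349_498_550  # assembly span from the Figure 2 caption


def fold_coverage(total_bases, genome_size):
    """Fold coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def approx_yield(coverage, genome_size):
    """Invert the definition: approximate bases needed for a given coverage."""
    return coverage * genome_size


# Rough yields implied by the rounded coverages in the text (estimates only).
for label, cov in [("PacBio HiFi", 53), ("10X read clouds", 128)]:
    gb = approx_yield(cov, ASSEMBLY_SPAN_BP) / 1e9
    print(f"{label}: about {gb:.1f} Gb for {cov}-fold coverage")
```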
The final assembly has a total length of 349 Mb in 51 sequence scaffolds with a scaffold N50 of 12.5 Mb (Table 1). Most of the assembly sequence (99.97%) was assigned to 32 chromosomal-level scaffolds, representing 30 autosomes (numbered by sequence length) and the W and Z sex chromosomes (Figure 2-Figure 5; Table 2).
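For readers unfamiliar with the N50 statistic quoted above, the sketch below shows the usual definition: the scaffold length at which the longest scaffolds, added in descending order, first cover at least half of the assembly span (N90 uses 90% instead). This is a minimal illustration in Python; the function name and the toy lengths are hypothetical and are not the A. efformata scaffold lengths.

```python
def nx(lengths, x=50):
    """Return the Nx length: the smallest scaffold in the set of longest
    scaffolds that together cover at least x% of the total span."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Hypothetical scaffold lengths in Mb, for illustration only.
toy_lengths = [19, 15, 12, 12, 8, 5, 3, 1]
print("N50:", nx(toy_lengths, 50))
print("N90:", nx(toy_lengths, 90))
```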

Genome annotation report
The GCA_921293045.1 genome was annotated using the Ensembl rapid annotation pipeline (Table 1). The resulting annotation includes 19,297 transcribed mRNAs from 11,393 protein-coding and 1,074 non-coding genes.

Sample acquisition and nucleic acid extraction
A single female A. efformata specimen (ilAplEffo1) was collected in Wytham Woods, Berkshire, UK (latitude 51.772, longitude -1.338) by Douglas Boyes (University of Oxford), using a light trap. The sample was identified by Douglas Boyes and snap-frozen on dry ice.
DNA was extracted at the Tree of Life laboratory, Wellcome Sanger Institute. The ilAplEffo1 sample was weighed and dissected on dry ice, with head tissue set aside for Hi-C sequencing. Thorax tissue was disrupted using a Nippi Powermasher fitted with a BioMasher pestle. Fragment size analysis of 0.01-0.5 ng of DNA was then performed using an Agilent FemtoPulse. High molecular weight (HMW) DNA was extracted using the Qiagen MagAttract HMW DNA extraction kit. Low molecular weight DNA was removed from a 200 ng aliquot of extracted DNA using the 0.8X AMPure XP purification kit prior to 10X Chromium sequencing; a minimum of 50 ng DNA was submitted for 10X sequencing. HMW DNA was sheared to an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30. Sheared DNA was purified by solid-phase reversible immobilisation using AMPure PB beads with a 1.8X ratio of beads to sample to remove the shorter fragments and concentrate the DNA. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit. Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
RNA was extracted from the abdomen tissue of ilAplEffo1 in the Tree of Life Laboratory at the WSI using TRIzol, according to the manufacturer's instructions. RNA was then eluted in 50 μl RNAse-free water, and the RNA concentration was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit RNA Broad-Range (BR) Assay kit. The integrity of the RNA was analysed using the Agilent RNA 6000 Pico Kit and Eukaryotic Total RNA assay.

Sequencing
Pacific Biosciences HiFi circular consensus and 10X Genomics read cloud DNA sequencing libraries were constructed according to the manufacturers' instructions. Poly(A) RNA-Seq libraries were constructed using the NEB Ultra II RNA Library Prep kit. DNA and RNA sequencing was performed by

Genome annotation
The Ensembl gene annotation system (Aken et al., 2016) was used to generate annotation for the A. efformata assembly (GCA_921293045.1). Annotation was created primarily through alignment of transcriptomic data to the genome, with gap filling via protein-to-genome alignments of a select set of proteins from UniProt (UniProt Consortium, 2019).
"The genome sequence of the lesser treble-bar moth, Aplocera efformata (Guenée, 1857)" stands out as a well-executed genomic study, providing valuable insights into the lesser treble-bar moth's genetic makeup. The meticulous methodology, detailed annotations, and transparent data sharing make this article a significant resource for researchers in genomics, evolutionary biology, and entomology.

Data availability
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Genomics, Phylogenomics, Mitogenomics, and Evolutionary Biology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Niklas Wahlberg
Lund University, Lund, Sweden
The report on the genome of Aplocera efformata (Lepidoptera: Geometridae) follows the standard format for genome notes, of which more than 200 have been published already for butterflies and moths. This is an amazing resource for the scientific community! In this particular case, the genome of A. efformata will be interesting to compare to the closely related A. plagiata, from which it is difficult to distinguish (as mentioned in the article).
The methods section has been written clearly, and it is good that the format has developed, with all the programs being listed along with the version that was used for assembling the genome. I have nothing more to add; it seems that Lepidoptera genomes are easy to sequence and easy to assemble.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: I am a founding member of the Psyche project, which aims to sequence the genomes of all Lepidoptera in Europe.
Reviewer Expertise: Phylogenomics, systematics, Lepidoptera, Geometridae
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

The genome release of the lesser treble-bar moth by Douglas Boyes and the DToL crew is another nice addition to the growing number of Lepidopteran reference genomes. Also, the presented species is not that trivial and could offer some interesting insights into the taxonomy and biology of these moths.
The provided metrics seem to agree with what is required from reference genomes, but my understanding of the bioinformatic work is not sufficient to evaluate the approach used any further. Instead, I try to contribute some thoughts on the taxonomic and faunistic parts of the report.

Background
You could point out that A. efformata is a monophagous species on Hypericum. This might be relevant, as monophagy vs polyphagy could be reflected in insect genomes (adaptations to plant secondary compounds etc.). Perhaps something to address in the future? If you have some speculation to offer about this (adaptive signatures, specialized detoxifying enzymes etc.), please feel free to add a couple of sentences at the end of the section.

Methods:
The trapping date is not provided.
A representative of the heterogametic sex was analyzed, which is great, as this is not trivial with night-active moths (males are more often caught at light). However, please assign a voucher specimen for the genome and provide details of its location (name of the collection, storing institute) for future reference.
I would have chosen thorax for RNA, as there are probably more tissue types present. Moreover, the female abdomen is often full of eggs (note that the specimen looks relatively young, with wings and scales intact, so it has probably not laid its eggs yet), which mainly carry a limited set of maternal RNAs for early embryonic development. Was this reflected in the RNA-seq data? If so, comment on this in the genome annotation report.
As pointed out in the background chapter, A. efformata is very similar to A. plagiata. While I trust the skill of your taxonomic expert, I nevertheless extracted the Co1 sequence from the mitochondrial genome you provided, and it seems to agree with efformata. The authors could point out that the species identity can be confirmed in this way using DNA barcoding (95.5% pairwise identity between the Co1 sequences of the two species, so a nice separation). It would also not harm to mention here the BINs for the different Aplocera species in BOLD for comparison.
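The pairwise identity mentioned in this comment can be illustrated with a short Python sketch. It assumes the two Co1 sequences have already been aligned to the same length, and it skips any column containing a gap; other conventions for handling gaps exist and give slightly different percentages. The function name and the toy sequences are hypothetical and are not real Aplocera barcodes.

```python
def pairwise_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences,
    counted over columns where neither sequence has a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments differing at one of ten positions (illustration only).
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAC"
print(f"{pairwise_identity(toy_a, toy_b):.1f}% identity")
```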

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Molecular biology, molecular ecology and genetics, taxonomy, DNA barcoding
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 1. Image of the female Aplocera efformata specimen from which the genome was sequenced. The ilAplEffo1 specimen was used to generate Pacific Biosciences, 10X Genomics, Hi-C and RNA-Seq data.

Figure 2. Genome assembly of Aplocera efformata, ilAplEffo1.1: metrics. The main plot is divided into 1,000 size-ordered bins around the circumference, with each bin representing 0.1% of the 349,498,550 bp assembly. The distribution of chromosome lengths is shown in dark grey, with the plot radius scaled to the longest chromosome present in the assembly (19,009,616 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 chromosome lengths (12,527,553 and 7,930,759 bp), respectively. The pale grey spiral shows the cumulative chromosome count on a log scale, with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/ilAplEffo1.1/dataset/CAKLCP01.1/snail.

Figure 5. Genome assembly of Aplocera efformata, ilAplEffo1.1: Hi-C contact map. Hi-C contact map of the ilAplEffo1.1 assembly, visualised in HiGlass. Chromosomes are arranged in size order from left to right and top to bottom. The interactive Hi-C map can be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=DFH-u6PYSz6oMane3MisUg.

Reviewer Report 28 March 2023
https://doi.org/10.21956/wellcomeopenres.20620.r55797
© 2023 Pohjoismäki J. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Jaakko L.O. Pohjoismäki
Department of Environmental and Biological Sciences, University of Eastern Finland, Joensuu, Finland