The genome sequence of the Variegated Golden Tortrix, Archips xylosteana (Linnaeus, 1758)

We present a genome assembly from an individual female Archips xylosteana (the Variegated Golden Tortrix; Arthropoda; Insecta; Lepidoptera; Tortricidae). The genome sequence is 650.6 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the W and Z sex chromosomes. The mitochondrial genome has also been assembled and is 16.39 kilobases in length. Gene annotation of this assembly on Ensembl identified 19,861 protein coding genes.

Tortricidae is a large family that includes major pests, biological control agents and model Lepidoptera for the study of genetics, insect pheromones, and evolution (Regier et al., 2012; Roe et al., 2009). Archips xylosteana is a polyphagous minor pest of fruit trees (Hoebeke et al., 2008). As part of wider efforts to study Tortricidae, Archips xylosteana has been investigated to determine the presence and prevalence of naturally occurring microsporidian pathogens (e.g., Nosema spp.; Pilarska et al., 2017) and its pheromone blend composition (Safonkin & Triseleva, 2008).

Genome sequence report
The genome was sequenced from one female Archips xylosteana (Figure 1) collected from Wytham Woods, Oxfordshire, UK (51.77, -1.34). A total of 41-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 7 missing joins or mis-joins, increasing the scaffold N50 by 3.31%. The final assembly has a total length of 650.6 Mb in 113 sequence scaffolds with a scaffold N50 of 22.4 Mb (Table 1). The snailplot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (99.57%) of the assembly sequence was assigned to 31 chromosomal-level scaffolds, representing 29 autosomes and the W and Z sex chromosomes. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
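The scaffold N50 quoted above is a length-weighted median: the length of the scaffold at which the cumulative sum of scaffold lengths, taken from longest to shortest, first reaches half of the total span. The following Python sketch illustrates that definition. The function name `scaffold_nx` and the toy lengths are hypothetical; they are not taken from this assembly or from any tool used in this study.

```python
def scaffold_nx(lengths, fraction=0.5):
    """Return the Nx scaffold length (N50 when fraction=0.5).

    Scaffolds are visited from longest to shortest; the result is the length
    of the scaffold at which the running total first reaches `fraction` of
    the summed span.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), NOT data from this assembly.
toy_lengths = [50, 40, 30, 20, 10]

toy_n50 = scaffold_nx(toy_lengths)        # half of the 150-unit span is reached at the 40-unit scaffold
toy_n90 = scaffold_nx(toy_lengths, 0.9)   # 90% of the span is reached at the 20-unit scaffold
print(toy_n50, toy_n90)
```

Applied to the 113 scaffold lengths of the deposited assembly, the same calculation would be expected to give the scaffold N50 and N90 values reported in Table 1 and Figure 2.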

Sample acquisition and nucleic acid extraction
A female Archips xylosteana (specimen ID Ox000680, ToLID ilArcXylo1) was collected from Wytham Woods, Oxfordshire, UK (latitude 51.77, longitude -1.34) on 2020-07-20, using a light trap. The specimen was collected and identified by Douglas Boyes (University of Oxford) and preserved on dry ice. The specimen used for Hi-C sequencing (specimen ID Ox001601, ToLID ilArcXylo2) was collected from the same [...] et al., 2020). The assembly was then scaffolded with Hi-C data (Rao et al., 2014) using YaHS (Zhou et al., 2023). The assembly was checked for contamination and corrected as described previously (Howe et al., 2021).
Manual curation was performed using HiGlass (Kerpedjiev et al., 2018) and Pretext (Harry, 2022). The mitochondrial genome was assembled using MitoHiFi (Uliano-Silva et al., 2023), which runs MitoFinder (Allio et al., 2020) or MITOS (Bernt et al., 2013) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Archips xylosteana assembly (GCA_947563465.1) in Ensembl Rapid Release.

Wellcome Sanger Institute - Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided

Kuppusamy Sivasankaran
Loyola College, Chennai, Tamil Nadu, India
I appreciate the authors for the genome assembly of the moth species Archips xylosteana. They have sequenced 650 megabases. The authors have assembled 31 chromosomes from the sequence using appropriate annotation software. They have identified 19,861 protein-coding genes in the assembly.
The authors have given the common name of the species in the first sentence of the abstract. Usually, the common name of the species can be given in the title and introduction. Kindly remove the common name of the species from the abstract.
The manuscript is well prepared, and it can be accepted for indexing.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Phylogenetic analysis of Noctuoidea moths
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Shixiang Zong
Beijing Forestry University, Beijing, China
The manuscript by Boyes and Gibbs provides whole-genome data for Archips xylosteana at the chromosomal level, including the W and Z sex chromosomes. The genes were annotated, with 19,861 protein-coding genes identified. This study is a step towards furthering our understanding of insect genomics.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes

Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Archips xylosteana, ilArcXylo1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 650,610,935 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (51,300,854 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (22,404,545 and 12,542,217 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the lepidoptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/Archips%20xylosteana/dataset/CANOAX01/snail.
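The binning described in the caption is simple arithmetic: 1,000 equal bins over a 650,610,935 bp span give about 650,611 bp (0.1%) per bin. The Python sketch below works through that arithmetic using only the span and longest-scaffold length stated in the caption. The helper `bin_index` is a hypothetical illustration and is not BlobToolKit's implementation.

```python
ASSEMBLY_SPAN_BP = 650_610_935      # total assembly span, from the caption
LONGEST_SCAFFOLD_BP = 51_300_854    # longest scaffold, from the caption
N_BINS = 1_000                      # number of bins around the circumference

bin_size_bp = ASSEMBLY_SPAN_BP / N_BINS   # 650,610.935 bp per bin
bin_percent = 100 / N_BINS                # each bin is 0.1% of the span


def bin_index(cumulative_bp, span_bp=ASSEMBLY_SPAN_BP, n_bins=N_BINS):
    """Return the 0-based bin holding a 0-based cumulative base position.

    Hypothetical helper: positions are counted along the scaffolds after
    sorting them from longest to shortest.
    """
    if not 0 <= cumulative_bp < span_bp:
        raise ValueError("position lies outside the assembly span")
    # Integer arithmetic avoids floating-point rounding at bin boundaries.
    return cumulative_bp * n_bins // span_bp


# The longest scaffold occupies positions 0 .. LONGEST_SCAFFOLD_BP - 1,
# so it fills bins 0 through the bin containing its final base.
last_bin_of_longest = bin_index(LONGEST_SCAFFOLD_BP - 1)
print(round(bin_size_bp), bin_percent, last_bin_of_longest)
```

Under these assumptions the longest scaffold alone accounts for roughly the first 79 of the 1,000 bins, which is consistent with it making up just under 8% of the assembly span.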

Figure 5. Genome assembly of Archips xylosteana, ilArcXylo1.1: Hi-C contact map of the ilArcXylo1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=Kb-HksSUTSudYmUTC5iP3A.

Reviewer Report 27 November 2023
https://doi.org/10.21956/wellcomeopenres.22548.r70363
© 2023 Zong S. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Table 3. Software tools: versions and sources. Columns: Software tool, Version.

Open Peer Review
Current Peer Review Status: Version 1