A reference genome, mitochondrial genome and associated transcriptomes for the critically endangered swift parrot (Lathamus discolor)

Abstract
The swift parrot (Lathamus discolor) is a Critically Endangered migratory parrot that breeds in Tasmania and winters on the Australian mainland. Here we provide a reference genome assembly for the swift parrot. We sequenced PacBio HiFi reads to create a high-quality reference assembly and identified a complete mitochondrial sequence. We also generated a reference transcriptome from five organs to inform genome annotation. The genome was 1.24 Gb in length and consisted of 847 contigs with a contig N50 of 18.97 Mb and L50 of 20 contigs. This study provides an annotated reference assembly and transcriptomic resources for the swift parrot to assist in future conservation genomic research.


Introduction
The swift parrot (Lathamus discolor) is a migratory parrot that breeds on the eastern seaboard of the island of Tasmania, Australia and winters on southeastern mainland Australia (Kennedy & Tzaros, 2005; MacNally & Horrocks, 2000; Saunders & Heinsohn, 2008). The swift parrot is Critically Endangered (BirdLife International, 2018) due to the combined effects of logging of its important breeding habitat (Webb et al., 2019) and the impacts of an introduced predator, the sugar glider (Petaurus breviceps) (Heinsohn et al., 2015). Population viability analysis has shown that the already small population of only a few hundred swift parrots (Olah et al., 2021) is likely to rapidly decline over coming generations (Heinsohn et al., 2015; Owens et al., 2023). Although the species has already been subject to population genetic study (Olah et al., 2021; Stojanovic et al., 2018), there remain outstanding questions about multiple aspects of the species' genetic ecology. For example, like other parrots with small population sizes (Morrison et al., 2020), understanding the genetic basis of immune competence is critical for managing demographic impacts of disease in swift parrots (Saunders & Tzaros, 2011). To facilitate detailed genomic research on this species, we sequenced DNA with PacBio long reads to generate a draft reference assembly and sequenced RNA from five tissues to provide transcriptomic resources to assist in genome annotation for the swift parrot.

Sample collection and DNA/RNA extraction
A single captive-bred female swift parrot died as a result of liver infection. Tissue samples were dissected and flash frozen at -80°C or preserved in RNAlater before being frozen at -80°C. High molecular weight (HMW) DNA was then extracted from heart and kidney tissue using the Nanobind Tissue Big DNA Kit v1.0 (Circulomics: SKU 102-302-100) using the standard protocol. A Qubit fluorometer was used to assess the concentration of DNA with the Qubit dsDNA BR assay kit (Thermo Fisher Scientific). Total RNA was extracted from gonad, spleen, liver, heart and kidney using the RNeasy Plus Mini Kit (Qiagen: 74134) with RNAse-free DNAse I set (Qiagen: EN0521) using the standard protocol. RNA quality was determined using the NanoDrop (Thermo Fisher Scientific) and RNA integrity (RIN) score determined using the Bioanalyzer RNA nano 6000 kit (Agilent 2100: 5067-1511).

Library construction and sequencing
HMW DNA was sent for Pacific Biosciences High Fidelity (PacBio HiFi) library preparation with the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences: 101-853-100) and sequencing on one single molecule real-time (SMRT) cell of the PacBio Sequel II at the Australian Genome Research Facility (St Lucia, Australia). Total RNA from the heart, gonad, kidney, liver and spleen was sequenced as 100 bp paired-end (PE) reads using an Illumina Novaseq 6000 with Illumina Stranded mRNA library preparation at the Ramaciotti Centre for Genomics (University of New South Wales, Kensington, Australia).

REVISED Amendments from Version 1
We have addressed the reviewers' comments about the mitogenome assembly and the other minor comments.

Mitochondrial assembly
The mitochondrial genome was identified from the reference genome assembly using MitoHiFi v2 (Allio et al., 2020; Uliano-Silva et al., 2023). MitoHiFi identified the most taxonomically closely related publicly available mitochondrial genome as the thick-billed parrot (Rhynchopsitta pachyrhyncha) (NCBI reference sequence OR209192.1). The mitochondrial reference sequence for the thick-billed parrot was then used to search for the swift parrot mitochondrial genome. The identified mitochondrial sequence was then added to the genome assembly and annotated using MITOS v2.1.7 (Donath et al., 2019) and visualised using Proksee (Grant et al., 2023).

Genome annotation
Genome annotation was performed using FGENESH++ v7.2.2 (Softberry; RRID:SCR_018928 (Solovyev et al., 2006)) using the longest open reading frame as predicted from the global transcriptome, non-mammalian settings and optimised parameters supplied with the American crow (Corvus brachyrhynchos) gene finding matrix, the most closely related species with a gene finding matrix provided by FGENESH++. BUSCO v5.4.6 (Simao et al., 2015) in protein mode was run on Galaxy Australia to assess the completeness of the annotation with the vertebrata_odb10 (n = 3354) and aves_odb10 (n = 8338) lineages. The 'genestats' script (https://github.com/darencard/GenomeAnnotation) was used to obtain the average number of exons and introns and the average exon and intron length.
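To illustrate the kind of summary reported for the annotation, the following is a minimal sketch of how the average number of exons and the average exon length could be derived. It is not the 'genestats' script itself. It assumes the annotation has been exported as tab-separated GFF3 with `exon` features carrying a `Parent` attribute; the function name `exon_stats` and the toy records are hypothetical.

```python
from collections import defaultdict


def exon_stats(gff_lines):
    """Return (mean exons per parent feature, mean exon length in bp).

    Assumes GFF3-style lines: nine tab-separated columns, 1-based
    inclusive coordinates, and a Parent=... attribute on exon rows.
    """
    exons_per_parent = defaultdict(int)
    total_length = 0
    n_exons = 0
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "exon":
            continue  # only exon features contribute
        start, end = int(fields[3]), int(fields[4])
        attrs = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        exons_per_parent[attrs.get("Parent", "unknown")] += 1
        total_length += end - start + 1  # inclusive coordinates
        n_exons += 1
    if n_exons == 0:
        raise ValueError("no exon features found")
    return n_exons / len(exons_per_parent), total_length / n_exons


# Hypothetical toy annotation: two transcripts, three exons.
toy_gff = [
    "chr1\tsketch\texon\t1\t100\t.\t+\t.\tID=e1;Parent=t1",
    "chr1\tsketch\texon\t201\t300\t.\t+\t.\tID=e2;Parent=t1",
    "chr1\tsketch\texon\t501\t700\t.\t-\t.\tID=e3;Parent=t2",
]
mean_exons, mean_exon_length = exon_stats(toy_gff)
print(mean_exons, round(mean_exon_length, 2))
```

Intron statistics could be obtained the same way by sorting the exons of each parent and measuring the gaps between consecutive exons.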

Genome assembly
Genome assembly using Hifiasm with PacBio HiFi data from a single SMRT cell resulted in a coverage of 28.7x and a genome of 1.24 Gb in size consisting of 847 contigs with a contig N50 of 18.97 Mb and L50 of 20 contigs. The genome assembly was also highly complete with 97.0% of aves_odb10 complete BUSCOs identified (Table 1). The mitochondrial genome was 17,265 bp long and contained 38 genes, including 22 tRNAs and 14 protein-coding genes, with a GC percentage of 44.88% (Figure 1).
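For readers less familiar with the contiguity metrics quoted above, the sketch below shows how contig N50 and L50 are conventionally defined: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size, while L50 is the number of contigs needed to get there. The function name `n50_l50` and the toy contig lengths are illustrative assumptions and are not taken from the swift parrot assembly.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # 'length' is N50, 'count' is L50
            return length, count


# Hypothetical toy assembly of seven contigs (total 300 bp).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_contigs))  # (70, 2)
```

Under this definition, an N50 of 18.97 Mb with an L50 of 20 means that the 20 longest contigs together cover at least half of the 1.24 Gb assembly.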

Transcriptome assembly and genome annotation
Trimming retained greater than 99.95% of raw reads, which were then aligned to the repeat-masked reference genome. Individual tissue transcriptomes had variable mapping rates from 31.04% for heart tissue to 82.76% for gonad tissue (kidney: 62.26%, liver: 78.84%, spleen: 73.60%). The alignment rate for the heart tissue was low, so we excluded heart transcripts from downstream analysis.

Introduction:
Line 4: I suggest changing "its important breeding habitat" to something more informative about the type of habitat that the species occupies.
Line 8: I suggest replacing "genetic ecology" with a more appropriate concept such as ecological genetics.
Line 11: Consider changing "we sequenced DNA with PacBio long reads to generate a draft reference assembly" to something like "we produced a draft genome assembly for the species using PacBio long reads".

Methods:
Transcriptome assembly: I would encourage the authors to make it clear to the reader why they decided to remove the noncoding transcripts from the transcriptome and why they decided not to use them for the annotation. The important role of long non-coding RNAs in evolution is increasingly being demonstrated (e.g. Toomey et al. 2018; Mattick et al. 2023) and thus it seems strange to ignore them.

Results:
Genome assembly: Regarding the mitogenome, I would suggest mentioning that the mitogenome was circularized.Also, I would consider annotating the two rRNA genes and the control region in Proksee.
Transcriptome assembly and genome annotation: Given the low BUSCO completeness of the annotation, despite the relatively high completeness of the transcriptome, I would encourage the authors to use an alternative genome annotation pipeline that makes better use of the RNA-seq data, such as BRAKER3.
In addition, given the high number of genes obtained in the annotation, I would consider doing some kind of filtering of the raw annotation, e.g. using gFACs (Ref 3) to remove potentially misannotated genes.
experience in genome assembly I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Reviewer Report 10 September 2024
https://doi.org/10.5256/f1000research.170611.r317694
© 2024 Benham P. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Phred Benham
University of Massachusetts Amherst Department of Biology (Ringgold ID: 117236), Amherst, Massachusetts, USA
The authors have addressed my comments and those of the other reviewer. I have no further comments to add.
Are the rationale for sequencing the genome and the species significance clearly described?

Phred Benham
University of Massachusetts Amherst Department of Biology (Ringgold ID: 117236), Amherst, Massachusetts, USA
This manuscript describes a de novo assembly of the critically endangered swift parrot.
Generally the sequencing and assembly methods reflect current standards and they produce a highly contiguous, contig-level assembly for this species that will be of value for various conservation genomics and other molecular ecology questions. I have only a few minor comments.
Is there a reason you did not attempt to produce a scaffolded assembly?
If there is space it would be nice to have a figure(s) showing the parrot, distribution, etc.
I agree with the other reviewer that this assembly is larger than the typical avian mito-genome and worth confirming there is not spurious sequence included.
Was a voucher specimen preserved of the parrot that died?
The last sentence of the 'mitochondrial assembly' section is not clearly written, please revise.

Luke Silver
The mitochondrial contig listed on NCBI is the correct mitochondrial assembly. It appears that MitoHiFi annotates the entire contig that was identified as containing the mitochondrial genome, even if it is much larger than the expected size.
In the assembly file I had replaced the identified contig with only the portion of sequence that represents the mitogenome, but had not done so in the data that were used to annotate and produce Figure 1.
I have edited and updated the manuscript to reflect the changes. "The mitochondrial reference sequence for the thick-billed parrot was then used to search for the swift parrot mitochondrial genome. The identified mitochondrial sequence was then added to the genome assembly and annotated using MITOS v2.1.7 (Donath et al., 2019) and visualised using Proksee (Grant et al., 2023)."
I have edited the sentence to name the predator: "… and the impacts of an introduced predator, the sugar glider (Petaurus breviceps)".
I have replaced high-quality with draft when referring to the genome assembly throughout.
I have stated that hifiasm was run with default parameters: "Hifiasm, with default parameters, was run on Galaxy Australia…"

Trimmomatic v0.39 (RRID:SCR_011848) (Bolger et al., 2014) with the parameters SLIDINGWINDOW:4:5, LEADING:5, TRAILING:5 and MINLEN:25 and ILLUMINACLIP:2:30:10 with the TruSeq3-PE adapters was used to quality trim reads. The repeat-masked genome was indexed and trimmed reads aligned using the --dta parameter with hisat2 v2.1.0 (RRID:SCR_015530) (Kim et al., 2019). Resulting sam files were converted to bam format and sorted using samtools v1.9 (Danecek et al., 2021). Stringtie v2.1.6 (RRID:SCR_016323) (Pertea et al., 2015) was used to generate a GTF for each transcriptome. Stringtie v2.1.6 with the --merge parameter merged transcripts into a global transcriptome, retaining only transcripts with an FPKM > 0.1 and length > 30.
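The trimming, alignment and transcript assembly steps described above can be sketched as a small Python helper that only builds the command lines, so they can be inspected before anything is run. This is an illustration under stated assumptions, not the authors' actual invocation: the file names, the index name `genome_index`, the `trimmomatic` wrapper executable, the order of the trimming steps, and the mapping of the FPKM and length thresholds to StringTie's `-F` and `-m` merge options are all assumptions.

```python
import shlex


def build_sample_commands(sample, index="genome_index", adapters="TruSeq3-PE.fa"):
    """Build per-tissue commands (trim, align, sort, assemble) as argument lists.

    All file names are hypothetical placeholders derived from `sample`.
    """
    r1, r2 = f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"
    p1, u1 = f"{sample}_R1.paired.fq.gz", f"{sample}_R1.unpaired.fq.gz"
    p2, u2 = f"{sample}_R2.paired.fq.gz", f"{sample}_R2.unpaired.fq.gz"
    trim = [
        "trimmomatic", "PE", r1, r2, p1, u1, p2, u2,
        # Step order is an assumption; the parameter values are from the text.
        f"ILLUMINACLIP:{adapters}:2:30:10",
        "SLIDINGWINDOW:4:5", "LEADING:5", "TRAILING:5", "MINLEN:25",
    ]
    # --dta reports alignments tailored for downstream transcript assemblers.
    align = ["hisat2", "--dta", "-x", index, "-1", p1, "-2", p2,
             "-S", f"{sample}.sam"]
    sort = ["samtools", "sort", "-o", f"{sample}.sorted.bam", f"{sample}.sam"]
    assemble = ["stringtie", f"{sample}.sorted.bam", "-o", f"{sample}.gtf"]
    return [trim, align, sort, assemble]


def build_merge_command(gtfs, out="global_transcriptome.gtf"):
    """Build the StringTie merge command with assumed FPKM/length filters."""
    return ["stringtie", "--merge", "-F", "0.1", "-m", "30", "-o", out, *gtfs]


# Heart is left out here because it was excluded from downstream analysis.
tissues = ["gonad", "kidney", "liver", "spleen"]
for tissue in tissues:
    for cmd in build_sample_commands(tissue):
        print(shlex.join(cmd))
print(shlex.join(build_merge_command([f"{t}.gtf" for t in tissues])))
```

Building argument lists rather than shell strings keeps each parameter visible and lets the same lists be passed to `subprocess.run` once the paths have been checked.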
The poor performance of the heart tissue is potentially due to the comparatively lower concentration of RNA in the heart tissue extraction (35.2 ng/μl) compared to the other four tissues (average = 1243 ng/μl [SD: 481]) and to the heart tissue not being stored in RNAlater. After using stringtie --merge to generate a global [...] long interspersed elements (LINEs), comparable with other bird genomes (Zhang et al., 2014) (Table 3).

Table 2.
Statistics of the global transcriptome and annotation of the swift parrot (Lathamus discolor) including BUSCO (Simao et al., 2015) completeness, calculated with both the vertebrata_odb10 and aves_odb10 lineages, and average exon length.

Yes Are the protocols appropriate and is the work technically sound? Yes Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others? Yes Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository? Yes Competing Interests:
No competing interests were disclosed.

have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard. Version 1

Are the rationale for sequencing the genome and the species significance clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository? Yes
Why did you use crow for the gene finding matrix? Were there not other parrots or something more closely related?
Competing Interests: No competing interests were disclosed.

have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above. Yes Are the protocols appropriate and is the work technically sound? Yes Are sufficient details of the sequencing and extraction, software used, and materials provided to allow replication by others? Yes Are the datasets clearly presented in a usable and accessible format, and the assembly and annotation available in an appropriate subject-specific repository? No Competing Interests:
No competing interests were disclosed.

Author Response 19 Aug 2024