Experimental and analytical pipeline for sub-genomic RNA landscape of coronavirus by Nanopore sequencer

ABSTRACT Coronaviruses (CoVs), including severe acute respiratory syndrome coronavirus 2, can infect a variety of mammalian and avian hosts with significant medical and economic consequences. During the life cycle of CoV, a coordinated series of subgenomic RNAs, including canonical subgenomic messenger RNA and non-canonical defective viral genomes (DVGs), are generated with different biological implications. Studies that adopted the Oxford Nanopore Technologies (ONT) sequencer to investigate the landscape and dynamics of viral RNA subgenomic transcriptomes applied arbitrary bioinformatics parameters without justification or experimental validation. The current study used bovine coronavirus (BCoV), which can be handled under biosafety level 2, for library construction and experimental validation using traditional colony polymerase chain reaction and Sanger sequencing. Four different ONT protocols, including direct RNA and direct cDNA sequencing with or without exonuclease treatment, were used to generate RNA transcriptomic libraries from BCoV-infected cell lysates. By rigorously examining the k-mer, gap size, segment size, and bin size, the optimal cutoffs for the bioinformatic pipeline were determined to remove sequence noise while keeping the informative DVG reads. The sensitivity and specificity of identifying DVG reads using the proposed pipeline can reach 82.6% and 99.6%, respectively, under the k-mer size cutoff of 15. Exonuclease treatment reduced the abundance of RNA transcripts; however, it was not necessary for future library preparation. Additional recovery of clipped BCoV nucleotide sequences with experimental validation expands the landscape of the CoV discontinuous RNA transcriptome, whose biological function requires future investigation. The results of this study provide the benchmarks for library construction and bioinformatic parameters for studying the discontinuous CoV RNA transcriptome.
IMPORTANCE Functional defective viral genomic RNA, containing all the cis-acting elements required for translation or replication, may play different roles in triggering cell innate immune signaling, interfering with canonical subgenomic messenger RNA transcription/translation, or assisting in establishing persistent infection. This study not only provides benchmarks for library construction and bioinformatic parameters for studying the discontinuous coronavirus RNA transcriptome but also reveals the complexity of the bovine coronavirus transcriptome, for which functional assays will be critical in future studies.
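The sensitivity and specificity quoted in the abstract are ratios derived from comparing pipeline calls against experimentally validated reads. The following is a minimal sketch of that arithmetic only; the class name `ValidationCounts`, the function names, and the counts in the example are hypothetical and are not taken from the study's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """Pipeline calls tallied against experimental validation (hypothetical layout)."""
    true_positive: int   # validated DVG reads that the pipeline called DVG
    false_negative: int  # validated DVG reads that the pipeline missed
    true_negative: int   # non-DVG reads that the pipeline rejected
    false_positive: int  # non-DVG reads that the pipeline called DVG


def sensitivity(c: ValidationCounts) -> float:
    """Fraction of validated DVG reads recovered by the pipeline."""
    positives = c.true_positive + c.false_negative
    if positives == 0:
        raise ValueError("no validated DVG reads to evaluate")
    return c.true_positive / positives


def specificity(c: ValidationCounts) -> float:
    """Fraction of non-DVG reads correctly rejected by the pipeline."""
    negatives = c.true_negative + c.false_positive
    if negatives == 0:
        raise ValueError("no validated non-DVG reads to evaluate")
    return c.true_negative / negatives


if __name__ == "__main__":
    # Illustrative counts only; these are NOT the study's numbers.
    counts = ValidationCounts(true_positive=8, false_negative=2,
                              true_negative=95, false_positive=5)
    print(f"sensitivity = {sensitivity(counts):.3f}")
    print(f"specificity = {specificity(counts):.3f}")
```

In a parameter scan such as the k-mer size comparison described in the abstract, the same two functions could be evaluated once per candidate cutoff and the resulting pairs compared.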

• Upload point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file
• Upload a clean .DOC/.DOCX version of the revised manuscript and remove the previous version
• Each figure must be uploaded as a separate, editable, high-resolution file (TIFF or EPS preferred), and any multipanel figures must be assembled into one file
• Any supplemental material intended for posting by ASM should be uploaded separate from the main manuscript; you can combine all supplemental material into one file (preferred) or split it into a maximum of 10 files, with all associated legends included
For complete guidelines on revision requirements, see our Submission and Review Process webpage. Submission of a paper that does not conform to guidelines may delay acceptance of your manuscript.
Data availability: ASM policy requires that data be available to the public upon online posting of the article, so please verify all links to sequence records, if present, and make sure that each number retrieves the full record of the data. If a new accession number is not linked or a link is broken, provide Spectrum production staff with the correct URL for the record. If the accession numbers for new data are not publicly accessible before the expected online posting of the article, publication may be delayed; please contact production staff (Spectrum@asmusa.org) immediately with the expected release date.
Publication Fees: For information on publication fees and which article types are subject to charges, visit our website. If your manuscript is accepted for publication and any fees apply, you will be contacted separately about payment during the production process; please follow the instructions in that e-mail. Arrangements for payment must be made before your article is published.

ASM Membership:
Corresponding authors may join or renew ASM membership to obtain discounts on publication fees. Need to upgrade your membership level? Please contact Customer Service at Service@asmusa.org.
The ASM Journals program strives for constant improvement in our submission and publication process. Please tell us how we can improve your experience by taking this quick Author Survey.
Thank you for submitting your paper to Spectrum.

Sincerely, Biao He Editor Microbiology Spectrum
Reviewer #1 (Comments for the Author):
The manuscript "Experimental and analytical pipeline for sub-genomic RNA landscape of coronavirus by Nanopore sequencer" reported the study on bovine coronavirus (BCoV) utilized four different ONT protocols and rigorous bioinformatic analysis, determining optimal parameters for identifying defective viral genome (DVG) reads. The proposed pipeline demonstrated high sensitivity and specificity in detecting informative DVG reads, offering benchmarks for studying discontinuous coronavirus RNA transcriptomes. The manuscript is well written and complete in all aspect. Here are my minor suggestions given below to improve the quality of current version of manuscript:
1. The mechanism of pathogenesis of SARS-CoV-2 viral genome is not discussed in the study as discussed in the following study. https://doi.org/10.1016/j.mehy.2020.110031
2. The structural information of drug targets, RBD and receptor domains; ACE2 mutations which are resistant to COVID-19 infection should also be discussed are missing which must be discussed in the introduction as discussed in the following studies. https://doi.org/10.1038/s41598-022-20773-9 https://doi.org/10.1007/s00726-021-02991-z
3. There are molecules used to combat COVID-19, which should also be discussed in the current study that is related to the study. https://doi.org/10.1016/j.jphs.2023.02.004 https://doi.org/10.1016/j.mehy.2020.110031

Reviewer #3 (Comments for the Author):
1. It is recommended to briefly introduce the role of 5' phosphate-dependent exonuclease processing in the Introduction, along with an explanation of using the RCS, which may give readers a clearer understanding of the experimental design.
2. As mentioned in the Results, "Higher reads from RNA libraries were observed than those from cDNA libraries." It may be due to the higher initial input of DRS than others. Using the same initial input for this comparison will be more convincing.
3. "In particular, protocol #2 (RNA without exonuclease-treatment protocol) showed the highest proportion of reads mapped to BCoV (Supplementary Fig S1C)." A further discussion to explain the reason is recommended.
4. In comparing RNA_exo1 and RNA1, the number of Human host reads increased by the exonuclease processing, while the opposite result was found in comparing RNA_exo2 with RNA2. Is it reasonable to explain this only in terms of "the removal of highly abundant rRNA, which leaves more pores free for RNA sequencing"?
5. In Figure 2, legend titles should be added to the graph, such as "number of fragment reads" and "k-mer size," the same applies to Figure 3.
6. Keep the aspect ratio when stretching the image in Figure 5.
7. The image resolution in the Supplementary needs to be improved.
8. Several grammatical mistakes should be corrected.

Dear Editor,
We would like to thank the reviewers for their constructive comments. We have prepared a revised version of our manuscript and answered the questions/comments raised by the reviewers point-by-point below. Revisions in the manuscript were also highlighted in red.
Comments from Reviewer #1
1. The mechanism of pathogenesis of SARS-CoV-2 viral genome is not discussed in the study as discussed in the following study.

Reply:
We thank the reviewer for the comment. The discussion has been added on page 17, lines 467-479, and pages 5-6, lines 102-111, as follows.
Although most people infected with SARS-CoV-2 develop a mild to moderate disease with virus replication restricted mainly to the upper airways, some progress to having a life-threatening acute respiratory distress syndrome (ARDS) with predispositions leading to immunopathology [54]. In vitro and clinical studies have shown that SARS-CoV-2 is a poor inducer of interferon [55]. Patients with mild COVID-19 disease showed extensive induction of type I and III interferon responses, whereas patients with severe disease demonstrated poor response in their antiviral capacity despite higher local inflammatory myeloid cell populations and equivalent viral loads [56, 57]. Higher local inflammatory response of the epithelium and endothelium triggers the imbalance between the activation of coagulation and the inhibition of fibrinolysis [58]. Such cascade of inflammation and coagulation interplayed among monocytes, macrophage and neutrophils were further amplified and eventually lead to severe immunopathology [59]. Corticosteroids are frequently used as general inhibitors of inflammation and 50% reduction in mortality by administration of dexamethasone to patients with severe COVID-19 was observed [60]. A neutralizing monoclonal antibody to IL-6 were also shown to increase survival in hospitalized patients [61].
There are kinds of cis-acting elements, required for viral genome transcription, gene expression and pathogenesis, in the 5' and 3' termini of coronaviral genome. The cis-acting elements in the 5' terminal of the genome are composed of multiple stem-loops (SLs), including SL I to VII. While the cis-acting elements in the 3' end of the genome consist of a bulged stem loop (BSL), pseudoknot (PK), hypervariable region (HVR) and poly(A) tail [14,15]. Functional DVGs (so-called DiRNA), if contain the cis-acting elements required for translation or replication, may play different roles in triggering cell innate immune signaling, interfering with the sgmRNA transcription/translation, or assisting in establishing persistence infection [16,17].
2. The structural information of drug targets, RBD and receptor domains; ACE2 mutations which are resistant to COVID-19 infection should also be discussed are missing which must be discussed in the introduction as discussed in the following studies.

Reply:
We thank the reviewer for the comment. Further introduction has been added on page 6, lines 113-129, as follows.
The S protein which mediates the binding with cellular receptors for infection are a characteristic feature of the coronaviridae family. Both SARS-CoV-2 and SARS-CoV bind to a common human receptor, angiotensin-converting enzyme 2 (ACE2), which is also the receptor for other human CoVs except MERS-CoV [18]. The S protein consists of two subunits: The S1 unit at the N-terminus of S protein forms the head that contains receptor-binding domain (RBD) is responsible for cellular receptor binding; whereas the S2 unit presents in the stalk of S protein mediating the fusion process for viral entry [18]. These two subunits are separated by the site, which contains a furin cleavage motif and is cleaved by the transmembrane serine protease TMPRSS2 in the virus-producing cell [19]. This cleavage activates the S2 subunit trimers to fuse viral and host lipid bilayers, releasing the viral ribonucleoprotein complex into the cell. Amino acid variations in human ACE2 proteins have been suggested to mediate RBD binding affinity, which could either enhance or inhibit virus entry [20]. As such, vesicles designed to carry the S protein or RBD could be used to antagonize virus entry [21-23]. Alternatively, extracellular vesicles (EVs), derived from stem cells that carry ACE2, could be used to treat infections by coronaviruses [24, 25].
3. There are molecules used to combat COVID-19, which should also be discussed in the current study that is related to the study.

Reply:
We thank the reviewer for the comment. Further discussion has been added on pages 17-18, [...] occurred or synthetic, are capable of reducing viral RNA levels by competing the cellular or viral resources and could serve as the anti-viral agents [67, 68]. The algorithms developed in this study could serve the platform identifying functional and interfering DVGs.
Comments from Reviewer #3
1. It is recommended to briefly introduce the role of 5' phosphate-dependent exonuclease processing in the Introduction, along with an explanation of using the RCS, which may give readers a clearer understanding of the experimental design.

Reply:
We appreciate the reviewer's comment. Revisions have been made for clarity on page 7, lines 153-158, and page 8, lines 181-182, as follows.
Since the majority of host messenger RNA (mRNA) or viral RNA transcripts are protected from degradation by m7Gppp cap and triphosphate [36], the 5' phosphate-dependent exonuclease, by removing the RNA population with 5' monophosphate group such as ribosomal RNA (rRNA) and transfer RNA (tRNA), was tested in this study for its influence on BCoV transcriptome.
As the unmethylated RNA calibration standard (RCS), used to assess the false detection rate of the methylation calling of RNA molecule, is only offered in DRS kit, 53.3 and 26.7% of reads on average from protocol #1 and #2 were mapped to RCS with higher RCS reads proportion observed in exonuclease treatment libraries (Table 1).
2. As mentioned in the Results, "Higher reads from RNA libraries were observed than those from cDNA libraries." It may be due to the higher initial input of DRS than others. Using the same initial input for this comparison will be more convincing.

Reply:
As the reviewer mentioned, using the same initial input for this comparison would be more convincing. However, the ONT MinION requires strict input amounts for the chips, which are 500 ng in direct RNA sequencing (DRS) and 250 ng in direct cDNA sequencing (DCS). Higher or lower input may affect the sequencing performance. Therefore, each protocol in this study used its own initial input, as suggested by the manufacturer, to help achieve the final product amount loaded onto the chips. Furthermore, we did not observe any difference in the percentages of fragment numbers or DVG types between DRS and DCS under the pre-set cutoffs of the different parameters. We appreciate the reviewer's comment, and a clarification has been made on page 8, lines 176-177, as follows.
Higher reads from RNA libraries were observed than those from cDNA libraries, possibly due to higher initial RNA input during the library constructions as suggested by the manufacturer's protocols.
3. "In particular, protocol #2 (RNA without exonuclease-treatment protocol) showed the highest proportion of reads mapped to BCoV (Supplementary Fig S1C)." A further discussion to explain the reason is recommended.

Reply:
We thank the reviewer for the comment. Further discussion has been added on page 14, lines 377-383, as follows.
The lower abundancy after exonuclease treatment in both RNA and cDNA libraries could be due to the depletion of cap-free viral RNA by 5' phosphate-dependent exonuclease [43], which leads to fewer types of DVG observed (Fig 2C). Moreover, the possible explanations of highest proportion of reads mapped to BCoV using RNA without exonuclease-treatment protocol could be the biased types and less abundancy of the transcripts introduced during either the RT-PCR step [29] or additional clean-up steps during cDNA library construction.
4. In comparing RNA_exo1 and RNA1, the number of Human host reads increased by the exonuclease processing, while the opposite result was found in comparing RNA_exo2 with RNA2. Is it reasonable to explain this only in terms of "the removal of highly abundant rRNA, which leaves more pores free for RNA sequencing"?

Reply:
According to the literature, RCS commonly accounts for 20-30% of the total reads. In contrast, the high proportion of RCS (46-76%) found in the RNA-exo group in this study is uncommon. One possible explanation for the effect of the 5' phosphate-dependent exonuclease is the removal of the abundant rRNA, which leaves more pores free for RNA sequencing, as originally stated in the manuscript.
Another explanation could be RNA degradation during the process. We appreciate the reviewer's comment, and a minor revision has been made for clarity on page 8, lines 187-189, as follows.
Therefore, the relatively high percentage of RCS found in protocol #1 could be due to the removal of highly abundant rRNA, which leaves more pores free for RNA sequencing, especially RCS, although the possibility of a higher percentage of RNA degradation by exonuclease treatment could not be ruled out.
5. In Figure 2, legend titles should be added to the graph, such as "number of fragments reads" and "k-mer size," the same applies to Figure 3.

Reply:
We thank the reviewer for the advice. The legend titles have been added to Figure 2 and Figure 3 as suggested.
6. Keep the aspect ratio when stretching the image in Figure 5.

Reply:
We thank the reviewer for the comment. We have reproduced Figure 5 without stretching and with the correct aspect ratio.
7. The image resolution in the Supplementary needs to be improved.

Reply:
We thank the reviewer for the kind advice. We have reproduced all the supplementary figures and confirmed that they meet the journal criteria.
8. Several grammatical mistakes should be corrected.

Reply:
We appreciate the reviewer's comment. The manuscript has been edited by a native English speaker, and the grammatical mistakes have been corrected.

February 26, 2024
1st Revision - Editorial Decision
Re: Spectrum03954-23R1 (Experimental and analytical pipeline for sub-genomic RNA landscape of coronavirus by Nanopore sequencer)
Dear Dr. Day-Yu Chao:
Your manuscript has been accepted, and I am forwarding it to the ASM production staff for publication. Your paper will first be checked to make sure all elements meet the technical requirements. ASM staff will contact you if anything needs to be revised before copyediting and production can begin. Otherwise, you will be notified when your proofs are ready to be viewed.
Data Availability: ASM policy requires that data be available to the public upon online posting of the article, so please verify all links to sequence records, if present, and make sure that each number retrieves the full record of the data. If a new accession number is not linked or a link is broken, provide production staff with the correct URL for the record. If the accession numbers for new data are not publicly accessible before the expected online posting of the article, publication may be delayed; please contact ASM production staff immediately with the expected release date.
Publication Fees: For information on publication fees and which article types have charges, please visit our website. We have partnered with Copyright Clearance Center (CCC) to collect author charges. If fees apply to your paper, you will receive a message from no-reply@copyright.com with further instructions. For questions related to paying charges through RightsLink, please contact CCC at ASM_Support@copyright.com or toll free at +1-877-622-5543. CCC makes every attempt to respond to all emails within 24 hours.
ASM Membership: Corresponding authors may join or renew ASM membership to obtain discounts on publication fees. Need to upgrade your membership level? Please contact Customer Service at Service@asmusa.org.
PubMed Central: ASM deposits all Spectrum articles in PubMed Central and international PubMed Central-like repositories immediately after publication. Thus, your article is automatically in compliance with the NIH access mandate. If your work was supported by a funding agency that has public access requirements like those of the NIH (e.g., the Wellcome Trust), you may post your article in a similar public access site, but we ask that you specify that the release date be no earlier than the date of publication on the Spectrum website.
Embargo Policy: A press release may be issued as soon as the manuscript is posted on the Spectrum Latest Articles webpage. The corresponding author will receive an email with the subject line "ASM Journals Author Services Notification" when the article is available online.
The ASM Journals program strives for constant improvement in our submission and publication process. Please tell us how we can improve your experience by taking this quick Author Survey.
Thank you for submitting your paper to Spectrum.
Sincerely, Biao He Editor Microbiology Spectrum