The first genome sequences of human bocaviruses from Vietnam

As part of an ongoing effort to generate complete genome sequences of hand, foot and mouth disease-causing enteroviruses directly from clinical specimens, two complete coding sequences and two partial genomic sequences of human bocavirus 1 (n=3) and 2 (n=1) were co-amplified and sequenced, representing the first genome sequences of human bocaviruses from Vietnam. The sequences may aid future studies aimed at understanding the evolution of the virus.

HBoV-1 was first discovered in 2005 1 . Since then, three additional HBoV species, namely HBoV-2, HBoV-3 and HBoV-4, have been discovered 2,4 . While the clinical significance of HBoV remains unknown, its reported prevalence in the respiratory and gastrointestinal tracts worldwide ranges from 0 to 26% 5,6 . In Vietnam, the reported prevalence of HBoV was 2-17% 7-10 . Currently, sequence information on HBoV from Vietnam is limited, especially at the genome-wide level, although such knowledge may be essential for the development of sensitive and specific diagnostic PCR assays for local viral strains, and may aid future investigations documenting the circulation and spread of the viruses at a global scale.
Herein we report the recovery of two complete coding sequences (CDS) and two partial genomic sequences of HBoV from swabs of Vietnamese children enrolled in our ongoing hand, foot and mouth disease (HFMD) research program in Ho Chi Minh City. The research program aims to look at various disease aspects, including pathogen evolution and its potential implication for vaccine development and implementation.

Methods and results
Whole-genome sequencing of the dominant pathogens (including coxsackievirus A6 (CV-A6), CV-A10 and CV-A16) was performed on 296 RT-PCR positive swabs using an in-house MiSeq-based approach 11 . In brief, 110 µl of each selected swab was centrifuged at 13,500 rpm for 10 minutes to remove host cells and large cellular components. After DNase treatment, viral nucleic acid (NA) was isolated from 100 µl of supernatant using the QIAamp viral RNA kit (QIAgen GmbH, Hilden, Germany), and recovered in 50 µl of elution buffer (provided with the kit). Ten microliters of the isolated NA were subjected to cDNA synthesis using the Super Script III kit (Invitrogen, Carlsbad, CA, USA) and the FR26RV-Endoh primer (primer sequences can be found elsewhere 11 ). The cDNA was then converted to double-stranded DNA using exo-Klenow (Invitrogen), and subsequently preamplified using Platinum PCR supermix (Invitrogen) and the FR20RV primer 11 . The PCR product was then purified and subjected to library preparation using the Nextera XT DNA sample preparation kit (Illumina, San Diego, CA, USA), and was finally sequenced using MiSeq reagent kits (Illumina) on an Illumina MiSeq platform (Illumina) 11 .
After reference-based mapping 11 to generate the complete genome sequences of the targeted enteroviruses using Geneious software v 8.1.5 (Biomatters, Ltd, Auckland, New Zealand), the remaining reads were subjected to two publicly available metagenomic pipelines, Taxonomer 12 and Sequence-based Ultra-Rapid Pathogen Identification (SURPI) 13 , to explore the non-enteroviral sequence content of the tested swabs. Evidence of bocavirus sequences was found in four swabs (3 throat swabs and 1 rectal swab). A reference-based mapping approach using Geneious software (Biomatters) 11 was then employed to recover the HBoV genomes from the corresponding datasets. Subsequently, 2 CDS (one from a throat swab, 4925 bp in length, and the other from a rectal swab, 4898 bp in length; i.e. over 90% genome coverage) were successfully assembled with a mean coverage of 1,922 and 3,745, respectively. In the datasets from the remaining 2 swabs, only partial genomic sequences of HBoV, each 2870 bp in length and with a mean coverage of 15.4 and 448.7, respectively, were recovered.
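The mean coverage and genome coverage figures above can be illustrated with a minimal sketch. It assumes that per-position read depths have been exported from the mapping tool as a plain list of integers; the function name `coverage_summary`, its parameters and the toy input are hypothetical and are not part of the published pipeline.

```python
from statistics import mean


def coverage_summary(depths, min_depth=1):
    """Summarise per-position read depth over a reference region.

    depths    -- read depth at each reference position (hypothetical input,
                 e.g. exported from a read-mapping tool)
    min_depth -- depth at or above which a position counts as covered
    """
    if not depths:
        raise ValueError("depths must not be empty")
    # Breadth: fraction of reference positions reaching min_depth.
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "mean_depth": mean(depths),
        "breadth": covered / len(depths),
    }


# Toy example: five reference positions, one of them with no reads.
toy_depths = [0, 10, 20, 30, 40]
summary = coverage_summary(toy_depths)
```

Here `mean_depth` corresponds to the mean coverage of an assembly, and `breadth` to the fraction of the reference that was recovered. How a given mapping tool treats ambiguous or low-quality bases may differ from this sketch.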
Subsequent sequence alignment using MUSCLE 14 and phylogenetic analysis using the neighbor-joining method, both as implemented in Geneious (Biomatters) (Figure 1), revealed that all 3 Vietnamese HBoV sequences recovered from the throat swabs belonged to HBoV-1 and had >98% sequence similarity at the nucleotide level with other HBoV-1 strains. The remaining sequence belonged to HBoV-2 and was closely related to the Thai strain CU54TH (GU048663), with a sequence similarity of 97.3% (Figure 1). Similar results were obtained when the analyses were repeated for the 3 individual open reading frames (ORF1, ORF2, and ORF3) of the virus genome (Figure 1).
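As an illustration of how a nucleotide-level similarity figure can be derived from an alignment, the sketch below computes pairwise percent identity between two aligned sequences. It assumes that both sequences come from the same alignment (equal length, gaps written as "-") and that columns containing a gap are ignored; this convention, the function name `percent_identity` and the toy sequences are assumptions for illustration and may differ from the calculation performed by Geneious.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity between two sequences taken from one alignment.

    Columns where either sequence has a gap are skipped (an assumed
    convention; other tools may count gaps as mismatches).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences, not real HBoV data.
identity = percent_identity("AAAA", "AAAT")
identity_with_gap = percent_identity("ACGT-A", "ACGTTA")
```

In the toy example the first pair differs at one of four compared positions, and the second pair is identical once the gapped column is skipped.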
All four HFMD patients in whom HBoV was detected (3 infected with CV-A6 and 1 with CV-A12, Table 1) had mild HFMD, and were enrolled between November 2013 and March 2014. Three had vomiting, and two presented with runny nose and cough (Table 1).

Discussion
Herein we report for the first time 2 complete CDS alongside two partial genomic sequences of HBoV from Vietnam. Phylogenetically, the four HBoVs from Vietnam were closely related to other HBoV strains sampled from various countries worldwide, reflecting a wide distribution of these HBoV lineages at a global scale.
All three HBoVs detected in throat swabs belonged to species 1, while the remaining virus, detected in a rectal swab, was HBoV-2. This is in line with previous reports of the frequent detection of HBoV-1 in throat swabs and HBoV-2 in rectal swabs 5,6,15-18 , although our sample size was small. Likewise, all four HFMD patients in whom HBoVs were found were enrolled into our HFMD study during the seasonal peak of HBoV in southern Vietnam 8 .
Although the pathogenic potential of HBoV infections remains unknown, clinical signs/symptoms such as vomiting, runny nose and cough were also commonly recorded among HFMD patients in previous reports 19-21 . HBoV has commonly been co-detected with other pathogens in the respiratory and gastrointestinal tracts 5,10,16-18 . It was also previously detected in fecal samples of HFMD patients from Thailand 22 . Clearly, further research is needed to determine the contribution of coinfections to the clinical manifestation and pathogenesis of HFMD. Of note, previous reports suggested an association between coinfection with other viral pathogens, such as norovirus and rotavirus, and the clinical severity of HFMD 23 .
In conclusion, to the best of our knowledge, we are the first to report the complete CDS of HBoVs from Vietnam. The contribution of HBoV to clinical manifestation of HFMD requires further research.

Consent
The clinical samples used in this study were derived from an ongoing HFMD study in three referral hospitals in Ho Chi Minh City, Vietnam. The study was reviewed and approved by the local Institutional Review Boards and the Oxford Tropical Research Ethics Committee (OxTREC), University of Oxford, Oxford, United Kingdom. Written informed consent was obtained from a parent or legal guardian of each participant.

Author contributions
TTT and LVT: designed the study, analysed the test results, and drafted the manuscript. HMTV, NTTH, LNTN, NTA, HMT, HVH, NMT, TTK, THK, LNTN, NTH, NVVC, GT, and RHvD: enrolled patients, took samples and did laboratory testing. All authors have read the final manuscript and agreed with its contents.

Competing interests
No competing interests were disclosed.

Grant information
This work was supported by the Wellcome Trust [101104/Z/13/Z], [106680/B/14/Z].
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
No competing interests were disclosed.

Human bocavirus (HBoV) was first identified in 2005, and regarded as a causative pathogen of respiratory tract diseases. The paper reported the recovery of two complete coding sequences and two partial genomic sequences of HBoV from swabs of Vietnamese children enrolled in HFMD research program in Ho Chi Minh City. The experiments were well designed and performed. In addition, the results were described almost properly. In this meaning, the manuscript is sound and suitable for the indexing. However, as I mentioned below, this manuscript has still some points to be clearly explained.
Comparison of the genome sequences of human bocaviruses between from Vietnam and the others should be much more discussed.
To date, all of the HBoV genotypes contain the episomal structure. Then, it would be better to analyze it in the paper.
The genome of HBoV is organized in three ORFs: ORF1 encoding NS1 protein; ORF2 encoding NP1 protein; ORF3 encoding VP1 and VP2 proteins. So it was suggested that the Phylogenetic trees of nucleotide and amino acid sequences of the HBoV genes should be constructed.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
No competing interests were disclosed.

Author Response 03 Jan 2017
Thanh Tran Tan, Oxford University Clinical Research Unit, Vietnam
Comparison of the genome sequences of human bocaviruses between from Vietnam and the others should be much more discussed.

Response
We have now discussed this in the discussion section. The second sentence of the discussion reads "Phylogenetically, the four HBoVs from Vietnam were closely related to other HBoV strains sampled from various countries worldwide, reflecting a wide distribution of these HBoV lineages at global scales".
To date, all of the HBoV genotypes contain the episomal structure. Then, it would be better to analyze it in the paper.

Response
We thank the referee for this comment. Please forgive our ignorance, but we understand that the episomal structure is formed by repeated sequences at the 5' and 3' ends, which unfortunately were not fully sequenced. Therefore the analysis could not be done reliably.
The genome of HBoV is organized in three ORFs: ORF1 encoding NS1 protein; ORF2 encoding NP1 protein; ORF3 encoding VP1 and VP2 proteins. So it was suggested that the Phylogenetic trees of nucleotide and amino acid sequences of the HBoV genes should be constructed.

Response
We have reconstructed additional phylogenetic trees according to the suggestion of Dr Xiu-ling Ji. We have added those additional phylogenetic trees to this revised version, and added a sentence describing them to the results section: "Similar results were obtained when the analyses were done for 3 individual open reading frames, ORF1, ORF2 and ORF3 (Figure 1)".

Although the article adds novel information on HBoV epidemiology in Vietnam and presents the sequences of current strains in this geographic region, it has numerous technical shortcomings. The methodology used is not sufficiently described. E.g., the authors state that coxsackieviruses were detected by whole genome sequencing. This technique, however, would detect genomic host DNA, coxsackie is an RNA virus. Moreover, primer sequences and PCR protocols are missing, and alignments are not shown. The reference list is extremely short, and the overall description of methods, results and discussion is weak.
We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.
No competing interests were disclosed.

Response
We thank Dr Verena Schildgen and Dr Oliver Schildgen for their constructive comments. Please allow us to clarify that the method used was developed in our laboratory for the amplification and sequencing of viral sequences, and it has been published (citation #11). Although host DNA can also be simultaneously sequenced, investigation of its presence in the obtained reads is beyond the scope of the present Research Note. Given that the method used was detailed in our previous publication (including primer sequences), we have chosen to present it briefly in this revised version as per the reviewers' comment. Likewise, we have updated our reference list (from 11 to 23 references), and added more text to the discussion section. Please also refer to our responses to the other reviewers regarding the updated results section; we hope the reviewers appreciate that the format of a Research Note is concise.
No competing interests were disclosed.

This paper describes the first complete genome sequences of human bocaviruses in Vietnam. Reporting complete genome sequences from potential local viral pathogens is vital to develop accurate diagnostic methods and to perform additional studies. However, depending on the scope of this journal, this paper would also be suitable for publication in the journal Genome Announcements.
The paper is very compact and carefully written, however I feel that some essential information is missing: The authors show that they have detected 4 bocaviruses in enterovirus positive samples, as identified by RT-PCR, which demonstrates the strength of agnostic deep sequencing. However, could it be that the symptoms observed in these patients was caused by these enteroviruses and not by the bocaviruses detected in these samples? And which specific enteroviruses (or other viral pathogens) were detected in these bocavirus positive samples?
Three viruses were found in throat swabs while one virus was found in rectal swabs. Which virus was found in which sample? E.g. was the genome coverage in the rectal swab lower and does this perhaps also explain the different species detected - was species 2 found in rectal swabs and species 1 in throat swabs?
And some minor points: In the abstract the authors mention that "The sequences may aid future study aiming at understanding the evolution of the pathogen". However, in the introduction and conclusion they mention that the clinical significance of bocavirus infection remains unknown. I suggest to change the word pathogen by virus.
The phylogenetic tree is difficult to read in the current resolution.