Complete Coding Sequence of a Chikungunya Virus Strain Imported into Slovenia from Thailand in Late 2018

A case of chikungunya virus infection was imported from Thailand into Slovenia in late 2018. The infection was diagnosed using real-time reverse transcription-PCR, the virus was isolated in cell culture, and the whole genome was sequenced. Phylogenetic analysis of the nearly complete viral genome indicated that the virus belongs to the Indian Ocean lineage but does not possess the A226V mutation in the envelope protein E1.

using the Nextera XT kit, and samples were labeled with the Nextera XT index kit, according to the manufacturer's instructions (Illumina). Libraries were sequenced using the MiSeq reagent kit V3 and the MiSeq system. Sequencing generated more than 3 × 10⁵ reads (paired at 2 × 301 nucleotides [nt]). Data were filtered, adapter sequences were removed with Trimmomatic (default settings; ILLUMINACLIP was performed with the built-in adapter sequences for Nextera paired-end libraries) (6), and the genome was assembled de novo using Unicycler (default settings) (7) on the Galaxy server (8). The average coverage of the final contig is 5,635×, and the GC content is 50.04%. The contig origin was determined by a BLAST search. In comparison to the most similar sequence of CHIKV (GenBank accession number MH400249), the newly assembled genome is 666 nt shorter on the 5′ end and 56 nt shorter on the 3′ end.
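As a rough illustration of how summary statistics such as the GC content and average coverage of a contig can be computed, the following Python sketch is offered. The function names and toy inputs are hypothetical, and the text does not state which tool produced the reported values, so this should be read as one possible calculation rather than the authors' procedure.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; other symbols
    such as N are ignored. This handling is an assumption.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def mean_coverage(total_aligned_bases: int, contig_length: int) -> float:
    """Return mean depth: aligned bases divided by contig length."""
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return total_aligned_bases / contig_length


if __name__ == "__main__":
    # Toy inputs only; these are not the sequenced data.
    print(gc_content("GGCCAATT"))
    print(mean_coverage(1000, 100))
```

Mean depth computed this way depends on how many bases survive filtering and trimming and actually align to the contig, so it will generally be lower than a naive estimate from the raw read count.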
The assembled genome is 11,063 nt long. Open reading frames (ORFs) were found using the NCBI ORFfinder (minimal ORF length set to 600 nt) (9) and manually annotated according to already-annotated sequences of CHIKV found in the UniProt database. The first ORF encodes a 2,247-amino-acid (aa)-long nonstructural polyprotein that is processed in the cell into proteins ns1, ns2, ns3, and ns4. The genome lacks the opal readthrough stop codon between Leu-1629 and Leu-1631 (between ns3 and ns4) and instead has arginine as the 1,630th amino acid of the nonstructural polyprotein. The opal readthrough stop codon between ns3 and ns4 regulates the expression of ns4. CHIKV circulates in two variants, one with an opal termination codon and the second with an opal codon readthrough. Reports show that the opal codon is important for viral maintenance in vertebrate and invertebrate hosts and that the sense (Arg) codon confers a selective advantage in Vero cells (10).
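The idea of scanning a genome for ORFs above a minimum length can be sketched as follows. This is a simplified, forward-strand-only scan (an assumption that suits a positive-sense RNA genome) that starts at ATG and stops at the first in-frame stop codon; it is not the NCBI ORFfinder implementation, and the function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # TGA is the opal codon


def find_orfs(seq: str, min_len: int = 600):
    """Return forward-strand ORFs of at least min_len nucleotides.

    Each ORF is reported as (start, end, frame), where start is 0-based,
    end is exclusive, and the span includes the stop codon.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence, not CHIKV: 2 nt offset, ATG, ten GCT codons, TGA.
    toy = "CC" + "ATG" + "GCT" * 10 + "TGA" + "AAA"
    print(find_orfs(toy, min_len=30))
```

In a scan of this kind, a genome carrying an in-frame opal codon would have its nonstructural ORF split at that position, whereas a sense codon such as arginine at the same site yields one continuous ORF.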
The second ORF encodes a 1,248-aa-long structural polyprotein comprising proteins C, E3, E2, 6K, and E1. The ORF does not carry the mutation at position 226 of the E1 glycoprotein (E1-A226V). This mutation was associated with enhanced transmission by Aedes albopictus mosquitoes in regions where the major mosquito vector, Aedes aegypti, is absent (11). Such a mutation and consequent vector switching could be the cause of the recent spike in CHIKV infections in southern Thailand, but since it is absent from the genome, the underlying cause of the CHIKV spread remains unknown.
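Checking for a substitution such as E1-A226V amounts to reading one residue from the mature E1 protein sequence. The sketch below assumes that the E1 amino acid sequence has already been cut out of the structural polyprotein at its annotated boundaries, which are not given in the text; the function names and the toy sequences are hypothetical.

```python
def residue_at(protein: str, position: int) -> str:
    """Return the residue at a 1-based position in a protein sequence."""
    if not 1 <= position <= len(protein):
        raise IndexError("position outside the protein sequence")
    return protein[position - 1].upper()


def classify_site(protein: str, position: int, reference: str, variant: str) -> str:
    """Classify a site as 'reference', 'variant', or 'other'."""
    residue = residue_at(protein, position)
    if residue == reference:
        return "reference"
    if residue == variant:
        return "variant"
    return "other"


if __name__ == "__main__":
    # Toy sequences, not real E1: only position 226 is meaningful here.
    toy_wild_type = "M" * 225 + "A" + "K" * 10
    toy_mutant = "M" * 225 + "V" + "K" * 10
    print(classify_site(toy_wild_type, 226, "A", "V"))
    print(classify_site(toy_mutant, 226, "A", "V"))
```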
The phylogenetic analysis (Fig. 1) indicates that the virus belongs to the Indian Ocean lineage and that it is most closely related to a recent isolate from China (GenBank accession number MH400249), with 99.8% nucleotide identity based on the best blastn hit (Web BLAST). The alignment of selected genomic sequences was made using the MUSCLE alignment algorithm (12) in Unipro UGENE (version 1.32, default settings) (13), and the phylogenetic analysis was performed with IQ-TREE (version 1.6.10, using 100 bootstrap replicates) (14, 15). The phylogenetic tree was drawn with FigTree (version 1.4.4).
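Nucleotide identity between two sequences can be illustrated with a short calculation over a pairwise alignment. The sketch below assumes the two sequences are already aligned to equal length with `-` as the gap character and skips gapped columns; this convention is an assumption and may differ from the way blastn computes the identity it reports. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not CHIKV data: 4 matches over 5 ungapped columns.
    print(percent_identity("ACGT-A", "ACGAAA"))
```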
Data availability. The genome sequence was deposited in GenBank under accession number MK848202. The raw reads were deposited into the SRA under accession number SRR9077329.