Genomic Analysis of Highly Virulent Georgia 2007/1 Isolate of African Swine Fever Virus

Sequence information will facilitate research on vaccine development.

We analyzed the complete coding region of the genome of the Georgia 2007/1 strain of ASFV, which was isolated after its introduction to Georgia in 2007. This information provides a baseline for comparison with other isolates obtained during the continued spread of ASF in this region and provides information for vaccine and diagnostic test development.

Viruses and Cells
The Georgia 2007/1 isolate was obtained from tissue samples from pigs submitted to the

Sequence Determination and Analysis
DNA for sequencing was amplified from 100 ng of purified viral DNA by using the Repli-G Kit (QIAGEN, Valencia, CA, USA). This method uses isothermal multiple displacement amplification and a processive DNA polymerase capable of replicating up to 100 kbp.
The DNA polymerase has a 3′ → 5′ exonuclease proofreading activity to maintain high fidelity in the amplified products. The nucleotide sequence of the complete coding regions of the genome of the Georgia 2007/1 isolate was determined by using a Roche (Basel, Switzerland) 454 GS FLX sequencer. Analyses of genome sequences, open reading frames (ORFs), and orthologous protein families were conducted by using Artemis (8), Glimmer software (9), and programs available at Viral Bioinformatics-Canada (10,11). ORFs were compared with the related ASFV genome sequences (Mkuzi 1979 isolate, GenBank accession no. AY261362 and Genotype I, Benin 97/1 isolate, GenBank accession no. AM712239) to identify potential frameshifts in the genome that interrupted reading frames. Regions of uncertainty were amplified by PCR and resequenced by the Sanger method to confirm the sequence. These uncertainties were located mainly in homopolymer sequences, which have been reported to cause ambiguities during Roche 454 sequencing (12,13). The GenBank accession no. for the genome sequence is FR682468.
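As an illustration of how homopolymer tracts might be flagged for confirmatory Sanger sequencing, the following minimal Python sketch lists runs of a single nucleotide in an assembled sequence. The function name `homopolymer_runs` and the 5-nt threshold are assumptions made for this example; they are not taken from the methods described above.

```python
import re


def homopolymer_runs(seq, min_len=5):
    """Return (start, end, base, length) for each single-base run of at least min_len.

    Coordinates are 0-based and half-open. The default min_len of 5 is an
    illustrative threshold, not a value reported in this study.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.end(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Toy sequence, not ASFV data.
    for run in homopolymer_runs("ACGTTTTTTGA"):
        print(run)
```

Regions reported by such a scan could then be prioritized for PCR amplification and resequencing; ambiguous bases such as N are ignored in this sketch.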

Sequence of Coding Regions
The  (9). This analysis identified 189 ORFs; the additional 23 all encoded proteins of <64 aa that lacked sequence similarity with known proteins (Technical Appendix). Eleven of these ORFs overlapped or were entirely within other larger ORFs. Thus, these ORFs are not likely to represent functional genes.
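The overlap criterion described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `Orf` record, the function names, and all coordinates are hypothetical, strand is ignored, and the code is not the procedure implemented in Glimmer or Artemis.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int  # 1-based, inclusive genome coordinate (hypothetical)
    end: int    # 1-based, inclusive genome coordinate (hypothetical)


def relation(candidate, reference):
    """Describe how a candidate ORF is positioned relative to a reference ORF."""
    if reference.start <= candidate.start and candidate.end <= reference.end:
        return "within"
    if candidate.start <= reference.end and reference.start <= candidate.end:
        return "overlapping"
    return "separate"


def classify(candidates, annotated):
    """Map each candidate name to its strongest relation to any larger annotated ORF."""
    rank = {"separate": 0, "overlapping": 1, "within": 2}
    result = {}
    for cand in candidates:
        best = "separate"
        for ref in annotated:
            # Only compare against ORFs that are longer than the candidate.
            if (ref.end - ref.start) <= (cand.end - cand.start):
                continue
            rel = relation(cand, ref)
            if rank[rel] > rank[best]:
                best = rel
        result[cand.name] = best
    return result


if __name__ == "__main__":
    annotated = [Orf("annotated_1", 100, 1000)]
    candidates = [Orf("c1", 200, 380), Orf("c2", 950, 1100), Orf("c3", 2000, 2150)]
    print(classify(candidates, annotated))
```

Candidates labeled "within" or "overlapping" correspond to the category of short ORFs that the text treats as unlikely to be functional genes.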

Genome Comparison of the Georgia 2007/1 Isolate with other ASFV Isolates
To determine the phylogenetic relationship between the Georgia 2007/1 isolate and other ASFV isolates (Table), the concatenated amino acid sequences of proteins encoded by 125 conserved ORFs comprising 40,810 aa were compared (Figure 1). This phylogenetic analysis shows that most isolates cluster in 2 main clades. The first group comprises isolates from West

Discussion
The continuing outbreak of ASF in the Caucasus region is caused by a highly virulent strain of ASFV that belongs to genotype II (7). Complete genome sequence analysis provides the most information; as viral genome analysis and sequencing become more routine, this procedure will become the method of choice. In the short term, targeted sequence analysis of several ORFs, including those whose phylogenies cluster more closely with that of the concatenated conserved ORF sequences, will provide a more accurate estimate of phylogenetic relationships than analysis of 1 ORF such as B646L.
Comparison of the rates of synonymous versus nonsynonymous substitutions across ASFV genes identified 14 or 18 genes that are undergoing positive selection (29). These genes included 2 of the proteins (CD2v and EP153R) that we identified as being most divergent at the amino acid level.
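To make the distinction between synonymous and nonsynonymous changes concrete, the following Python sketch counts codon differences of each type between 2 aligned coding sequences by using the standard genetic code. It is a deliberate simplification: it returns raw counts only, without the normalization by numbers of sites or the correction for multiple substitutions that a rate-based selection analysis such as that in (29) requires. The function name `count_codon_changes` and the example sequences are assumptions made for illustration.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def count_codon_changes(seq1, seq2):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs must be in-frame, aligned coding sequences of equal length.
    Codons containing gaps or ambiguity codes are skipped.
    """
    seq1, seq2 = seq1.upper(), seq2.upper()
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, equal in length, and in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq1), 3):
        codon1, codon2 = seq1[i:i + 3], seq2[i:i + 3]
        if codon1 == codon2:
            continue
        if codon1 not in CODON_TABLE or codon2 not in CODON_TABLE:
            continue
        if CODON_TABLE[codon1] == CODON_TABLE[codon2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


if __name__ == "__main__":
    # Toy sequences: GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
    print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))
```

A gene with many more amino acid-changing differences than silent ones, after appropriate normalization, would be a candidate for positive selection.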
Determination of the sequence of the ASFV isolate that was introduced into the Caucasus region provides a benchmark to which other isolates from this epidemic can be compared. This finding may enable sequence changes to be related to any changes in phenotype of the virus. In addition, detailed knowledge of the sequence will facilitate research on vaccine development by enabling the genes encoded to be expressed and assayed for their ability to confer protection in pigs. It will also facilitate the design of rationally attenuated vaccines by sequential deletion of genes involved in immune evasion and virulence.

Figure legend (partial): Phylogenetic analyses were conducted in MEGA4 (27). Scale bars indicate amino acid substitutions per site.