Exclusive and Common Subsets of Zika Virus Polyprotein Mutants

Two subsets of Zika virus (ZIKV) polyprotein amino acid positions are identified. One subset (Exclusive) consists of mutating amino acid positions which were found only in polyproteins isolated from ZIKV of human origin. A second subset (Common) consists of mutating amino acid positions which were found both in ZIKV polyproteins of human origin and in ZIKV polyproteins of Aedes species mosquito origin. The Exclusive subset was found to dominate the polyprotein from the N-terminal structural proteins through non-structural protein NS3. Although no longer greater than the Common subset beyond that point, elements of the Exclusive subset were present almost to the C-terminus of non-structural protein NS5. These results are considered in the context of reported epitopic and other biological characteristics of ZIKV.
Joel K Weltman*


Introduction
Zika virus (ZIKV) causes microcephaly, brain abnormalities and other neural diseases in gestating infants who are infected in utero [1,2]. Because of the highly adverse effects of ZIKV on gestating infants, it is important to develop preventive vaccines and therapeutic agents.
Reported here is a bioinformatics analysis of the ZIKV polyprotein that sorts mutations into those which occurred exclusively in ZIKV isolated from infected human hosts (Exclusive subset) and those which occurred in ZIKV isolated both from infected humans and from infected Aedes mosquito vectors (Common subset). It is proposed that the exclusive mutations may reflect immunological and other biological processes which occur in human hosts but not in the mosquito vector. These exclusive mutations thus may help guide development of human host-specific targets for vaccines and drugs.
Computations were performed using Anaconda 2.4.0 (64-bit) with Python 2.7.10, Numpy 1.10.1, Scipy 0.16.0, Sympy 1.0 and Matplotlib 1.4.3. Information entropy (H) was computed by the equation of Shannon [5] and is expressed in bits. Amino acid positions with H>0.0 in the polyprotein were sorted into subsets depending on whether the positive H value occurred at amino acid positions in ZIKV polyproteins obtained only from humans (Exclusive subset) or at amino acid positions in ZIKV polyproteins obtained both from humans and from Aedes species of mosquitos (Common subset).
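The entropy and sorting steps described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it is written for Python 3 using only the standard library, it assumes the polyprotein sequences are already aligned and of equal length, and it assumes that a position is Common when H>0.0 in both the human and the mosquito sequences. The function names and the four-residue toy sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy, in bits, of the residues observed at one position."""
    counts = Counter(column)
    total = len(column)
    # Sum of p * log2(1/p); a fully conserved position gives exactly 0.0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def positional_entropy(sequences):
    """Entropy at every position of equal-length, pre-aligned sequences."""
    return [shannon_entropy(column) for column in zip(*sequences)]


def classify_positions(human_seqs, mosquito_seqs):
    """Sort human positions with H > 0 into Exclusive and Common subsets.

    Assumption: a position is Common if H > 0 in both hosts, and
    Exclusive if H > 0 in human sequences only. Positions are 1-based.
    """
    h_human = positional_entropy(human_seqs)
    h_mosquito = positional_entropy(mosquito_seqs)
    exclusive, common = [], []
    for position, (hh, hm) in enumerate(zip(h_human, h_mosquito), start=1):
        if hh > 0.0:
            (common if hm > 0.0 else exclusive).append(position)
    return exclusive, common


# Hypothetical toy alignments (not ZIKV data).
human_seqs = ["MKAV", "MKAV", "MRAV", "MKTV"]
mosquito_seqs = ["MKAV", "MRAV", "MKAV"]
exclusive, common = classify_positions(human_seqs, mosquito_seqs)
```

In the toy data, position 2 varies in both hosts and position 3 varies only in the human sequences, so the two positions fall into different subsets.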
The Mann-Whitney nonparametric U test was performed with Scipy stats; a two-tailed p-value is reported. Z-tests were performed using 1000 pseudo-random trials and are reported with two-tailed probabilities.
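The randomization scheme behind the Z-tests is not specified here, so the following is only one plausible sketch of a Z-test based on 1000 pseudo-random trials. It assumes that the statistic is the H sum over a subset of positions, that the null distribution is built by drawing random subsets of the same size, and that the null distribution is approximately normal. The function name, the seed, and the toy values are hypothetical; only the Python 3 standard library is used.

```python
import math
import random
import statistics


def monte_carlo_z_test(values, subset_indices, trials=1000, seed=0):
    """Z-score and two-tailed p-value of a subset sum against random subsets.

    Assumption: the null distribution is the sum of `values` over randomly
    drawn subsets of the same size as `subset_indices` (without replacement).
    """
    rng = random.Random(seed)
    k = len(subset_indices)
    observed = sum(values[i] for i in subset_indices)
    null_sums = [sum(rng.sample(values, k)) for _ in range(trials)]
    mean = statistics.mean(null_sums)
    sd = statistics.stdev(null_sums)  # zero if all values are identical
    z = (observed - mean) / sd
    # Two-tailed probability under a standard normal distribution.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed


# Hypothetical toy data: 100 positions, the last 10 carry all the entropy.
toy_h = [0.0] * 90 + [1.0] * 10
z_score, p_value = monte_carlo_z_test(toy_h, list(range(90, 100)))
```

For the Mann-Whitney comparison, current SciPy releases provide `scipy.stats.mannwhitneyu(x, y, alternative="two-sided")`; older releases such as the one listed above may differ in how the p-value is reported.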

Results and Discussion
The basis for the definition of mutational subsets within the set of ZIKV amino acid positions is shown in Figure 1. As previously reported [6], plots of mutations in ZIKV isolated from humans distinguish mutations which occurred exclusively in viruses isolated from humans (top) from mutations common to viruses isolated both from humans and from Aedes mosquitos (bottom). This sorting of ZIKV mutations into Exclusive and Common subsets is consistent with immunological and metabolic processes which occur in humans but not in mosquitos and with structural processes common to both humans and mosquitos.
In Equations 1 and 2, the discrete summation curves are used to generate approximations as continuous, indefinite integrals. Difference curves (∆sum, Figure 2c) for the observed summations and for the approximating polynomials were obtained by subtraction: ∆sum = Equation (1) - Equation (2). The amino acid position with maximum value in the directly computed ∆sum curve is position 2068, which is within the NS3 non-structural protein domain of the polyprotein (Figure 2d). The amino acid position with maximum value in the ∆sum curve computed from the polynomial approximation is position 1609, which also is within the NS3 non-structural protein domain of the polyprotein. There were 69 amino acid positions in the human Exclusive subset from position 2068 to the C-terminus, with an H sum of 5.4103 bits, which is 33.17% of the observed Exclusive subset total.
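The directly computed ∆sum curve can be illustrated with a short sketch. It assumes that each subset is represented as a per-position list of H values (0.0 where the position does not belong to the subset) and that ∆sum is the Exclusive summation minus the Common summation. The function names and the six-position toy values are hypothetical, and the polynomial approximation step is not reproduced.

```python
from itertools import accumulate


def cumulative_entropy(h_values):
    """Running sum of H from the N-terminus to each position."""
    return list(accumulate(h_values))


def delta_sum_peak(h_exclusive, h_common):
    """Difference of the two summation curves and the 1-based peak position.

    Assumption: delta = cumulative H (Exclusive) - cumulative H (Common).
    If several positions share the maximum, the first one is returned.
    """
    sum_exclusive = cumulative_entropy(h_exclusive)
    sum_common = cumulative_entropy(h_common)
    delta = [e - c for e, c in zip(sum_exclusive, sum_common)]
    peak_index = max(range(len(delta)), key=delta.__getitem__)
    return delta, peak_index + 1


# Hypothetical per-position H values (bits) for a six-residue toy protein.
h_exclusive = [0.5, 0.0, 0.5, 0.0, 0.0, 0.125]
h_common = [0.0, 0.25, 0.0, 0.125, 0.5, 0.0]
delta, peak_position = delta_sum_peak(h_exclusive, h_common)
```

The peak of the difference curve marks the position up to which the Exclusive subset has accumulated its largest lead over the Common subset; beyond it, the Common subset accumulates entropy faster.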
Differentiation of the indefinite integrals in Equations 1 and 2 yields two equations, which were used to produce the H distribution curves in Figure 3.
The data in Figures 2 and 3 suggest that dominance of the human Exclusive subset of mutations extends beyond the structural protein components of the polyprotein. Precursors of the structural proteins comprise amino acids 1-794 of the polyprotein. The greater mutability of the elements of the human Exclusive subset, in comparison with the Common subset, probably reflects processes occurring in the human host but not in the mosquito vector. Such processes may include immunological escape [7], antibody dependent enhancement [8] and other biochemical and biophysical factors present in the host but not in the mosquito vector. For example, viral interactions with interferons are important determinants of outcome of ZIKV infection [9]. Other, specific immunological factors of the human host also affect the Zika virus. Neutralizing antibodies against the precursor membrane protein (prM) and the envelope (E) protein have been reported [10-12]. The prM and E proteins initially exist as precursors within the structural domain of the polyprotein. Antibody and T-cell responses against the NS1, NS2, NS3, NS4B and NS5 non-structural proteins have been reported [10,11,13,14].
The distribution of the elements of the human Exclusive subset is in agreement with the physiological and immunological Zika virus-host interaction processes discussed above. The activity of the Exclusive subset is detectable well into the non-structural domain of the polyprotein. A more detailed elucidation of the mutational pattern in the human Exclusive subset may increase our understanding of Zika virus biology, thereby facilitating the development of anti-Zika strategies, preventives and therapeutics.