Predictability of COVID-19 worldwide lethality using permutation-information theory quantifiers

This paper examines the predictability of COVID-19 worldwide lethality considering 43 countries. Based on the values of the Permutation entropy (H_s) and the Fisher information measure (F_s), we apply the Shannon-Fisher causality plane (SFCP), which allows us to quantify the disorder and evaluate the randomness present in the time series of daily death cases related to COVID-19 in each country. We also use H_s and F_s to rank the COVID-19 lethality in these countries based on the complexity hierarchy. Our results suggest that the most proactive countries, which implemented measures such as facemasks, social distancing, quarantine, massive population testing, and hygienic (sanitary) orientations to limit the impacts of COVID-19, show lower entropy (higher predictability) of the COVID-19 lethality. In contrast, the most reactive countries in implementing these measures show higher entropy (lower predictability) of the COVID-19 lethality. Given this, our findings suggest that these preventive measures are efficient in combating the COVID-19 lethality.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the infectious agent of coronavirus disease-19 (COVID-19), which has caused a pandemic with social and economic impacts around the world [1]. The Coronaviridae family contains SARS-CoV-2 and two other viruses that cause severe acute respiratory syndrome: SARS-CoV and MERS-CoV. These two viruses caused epidemic outbreaks with high pathogenicity and mortality in some countries in the past [2].
SARS-CoV-2 uses the same mechanism as SARS-CoV to enter host cells, but does so more slowly and accumulates, which leads to a longer incubation period and greater contagiousness. SARS, in turn, presents with more symptoms and greater disease severity [3]. Due to its high transmissibility, COVID-19 has caused a larger absolute number of deaths than the epidemics produced by SARS-CoV and MERS-CoV combined [4].
Only about a month after its discovery in China, SARS-CoV-2 had reached several countries in Asia and Europe, as well as the US [5]. Only three months after its discovery, COVID-19 was declared a pandemic by the World Health Organization (WHO) [6]. SARS-CoV-2 then spread to about 216 countries and territories. As of 13 January 2021, a total of 92,111,432 cases had been reported, with 1,973,059 deaths worldwide and significant numbers of cumulative deaths in the United States of America (389,084), Brazil (208,246), India (152,275), Mexico (139,022), and the United Kingdom (88,590) (WHO, 2021).
As reported by the WHO, the situation of COVID-19 infections and deaths changed throughout the year across the world's regions. The African and European Regions suffered a first wave of COVID-19 cases and deaths, followed by a decrease in these numbers, but the numbers have been increasing again since mid-September 2020. The South-East Asia Region experienced only one wave of COVID-19 cases and deaths, while other Regions experienced different patterns.
Countries and territories suffered different impacts from the SARS-CoV-2 pandemic, as reflected by the cumulative deaths per 100 thousand population, which varied enormously, from 0.0 in Mongolia to 191.5 in San Marino [7]. Previous studies indicate different spread dynamics among countries, possibly reflecting different degrees of efficiency in the response to the pandemic [8,9]. It is essential to understand COVID-19 lethality dynamics to design more robust strategies against the pandemic.
Thus, the chief goal of this paper is to provide a systematic diagnostic overview of the COVID-19 worldwide lethality for 43 countries. For each country, we study the time series of COVID-19 daily death cases. To this end, we evaluate the Permutation entropy (H_s) [20] and the Fisher information measure (F_s) [21].
Our findings show that the countries located farther from the random ideal position in the SFCP, such as Taiwan, Vietnam, New Zealand, Singapore, Iceland, Thailand, Cyprus, Estonia, Norway, and Latvia, are characterized by lower entropy and low disorder, which leads to high predictability of the COVID-19 lethality. On the other hand, the countries located closer to the random ideal position in the SFCP, such as Israel, Romania, Argentina, South Africa, Belgium, Iran, Czechia, India, Peru, Colombia, and Italy, are characterized by high entropy and high disorder, which implies low predictability of the COVID-19 lethality.
We provide insights showing that the dynamical analysis of COVID-19 lethality is a crucial approach to support the work of public policy makers. In this way, it will be possible to devise more efficient strategies, with more restrictive or more flexible measures, to combat the COVID-19 lethality.
The rest of this paper is organized as follows. Section "Methods" presents the theoretical framework of the Permutation entropy, the Fisher information measure, and the Sliding window approach. Section "Data" details the data set used in this paper. Section "Empirical results" reports our empirical results. Section "Discussion" discusses our empirical findings. Section "Conclusions" draws our conclusions.

Methods
In this section, we explain the methods that we use in this analysis. First, we present the theoretical framework of the Permutation entropy (H_s). Then, we formalize the Fisher information measure (F_s). Finally, we consider the Sliding window approach to carry out a dynamical analysis of the time dependence of H_s and F_s.
Recent studies have presented empirical evidence that the permutation entropy (H_s) correlates strongly with predictability in ecological models [30], in anomaly detection in climatological data [31], in infectious disease [32], and in COVID-19 lethality [19]. The cornerstone of permutation entropy is the investigation of the ordinal patterns in historical data. Thus, H_s associates symbolic sequences with the segments of the historical data under investigation [33], based on the local order obtained by comparing neighbouring values of the original series, and employs the probability distribution function (PDF) of these symbols to compute the complexity quantifier [22].
Let a time series be denoted by $y_q$, $q = 1, \dots, Q$, and consider the $Q - (d - 1)$ overlapping segments $Y_q = (y_q, y_{q+1}, \dots, y_{q+d-1})$ of length $d$. Within each segment, the values are ranked in increasing order to find the indices $s_0, s_1, \dots, s_{d-1}$ such that $y_{q+s_0} \le y_{q+s_1} \le \dots \le y_{q+s_{d-1}}$. The respective $d$-tuples (or words) $\pi = (s_0, s_1, \dots, s_{d-1})$ symbolize the original segments and can assume any of the $d!$ possible permutations of the set $\{0, 1, \dots, d - 1\}$. Given this, the Bandt & Pompe permutation entropy (order $d \ge 2$) is:

$$H(d) = -\sum_{\{\pi\}} p(\pi) \log p(\pi),$$

where $\{\pi\}$ denotes the summation over all the $d!$ possible permutations of order $d$ and $p(\pi)$ is the relative frequency of occurrences of the permutation $\pi$. The optimal $d$ is directly associated with the underlying stochastic process. However, to ensure reliable statistics, as a rule of thumb the literature suggests choosing the maximum $d$ that satisfies $n > 5d!$, where $n$ is the length of the series [22].
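As an illustration of the definition above, the following minimal Python sketch counts ordinal patterns and evaluates the entropy. It is not the authors' implementation: the function names are hypothetical, ties are broken by temporal order, the natural logarithm is used, and the optional normalization by log(d!) (so that the value lies in [0, 1]) is an assumption commonly made in the literature rather than something stated in the text.

```python
import math
from collections import Counter


def ordinal_pattern(segment):
    """Return the indices (s_0, ..., s_{d-1}) that sort the segment increasingly."""
    # sorted() is stable, so equal values keep their temporal order
    # (an assumed tie-breaking convention).
    return tuple(sorted(range(len(segment)), key=lambda i: segment[i]))


def permutation_entropy(series, d=4, normalize=True):
    """Bandt & Pompe permutation entropy of order d (hypothetical helper)."""
    n_segments = len(series) - (d - 1)  # Q - (d - 1) overlapping segments
    if d < 2 or n_segments < 1:
        raise ValueError("need d >= 2 and at least d observations")
    counts = Counter(ordinal_pattern(series[q:q + d]) for q in range(n_segments))
    # Patterns that never occur contribute 0 to the sum and are simply skipped.
    h = -sum((c / n_segments) * math.log(c / n_segments) for c in counts.values())
    # Assumed normalization: divide by the maximum entropy log(d!).
    return h / math.log(math.factorial(d)) if normalize else h
```

For instance, for the short series (4, 7, 9, 10, 6, 11, 3) and d = 2, four of the six neighbouring pairs are increasing and two are decreasing, so the normalized entropy returned by this sketch is about 0.918.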

Fisher information measure
The Fisher information measure (FIM) is a relevant approach to assess complexity [34] with interesting applications in the analysis of historical data [35]. Specifically, the FIM is an effective statistical measure of indeterminacy that can be interpreted from three different perspectives: (i) as a measure of the ability to estimate a parameter, (ii) as a measure of the amount of information extracted from a set of data, and (iii) as a measure of the state of disorder of a system or phenomenon (for more details, see [19]). According to Rosso et al. [36], the most relevant property of the FIM is the Cramer-Rao Bound (CRB), which is used to estimate nonlinear parameters. The discrete normalized form of Fisher's information measure ($0 \le F \le 1$) is given by

$$F[P] = F_0 \sum_{i=1}^{N-1} \left( \sqrt{p_{i+1}} - \sqrt{p_i} \right)^2,$$

where $p_i$ and $p_{i+1}$ are consecutive probabilities from the discrete distribution $P = \{p_1, \dots, p_N\}$ and $F_0$ is a normalization constant ($F_0 = 1$ if $p_1 = 1$ or $p_N = 1$, and $F_0 = 1/2$ otherwise). Because $F$ depends on the order in which the probabilities are listed, we used the lexicographic ordering, which is a total ordering on vectors, to effectively distinguish the distinct dynamics in the 2D plane ($H_s \times F_s$). For instance, for a vector of dimension $d = 3$, the word indices take values from the set $\{0, 1, 2\}$, and the six possible patterns are ordered as $\pi_1 = (0, 1, 2)$, $\pi_2 = (0, 2, 1)$, $\pi_3 = (1, 0, 2)$, $\pi_4 = (1, 2, 0)$, $\pi_5 = (2, 0, 1)$, and $\pi_6 = (2, 1, 0)$.
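The discrete Fisher information described above can be sketched in Python as follows. This is only a hedged illustration with hypothetical function names: it assumes the sum runs over consecutive pattern probabilities listed in lexicographic order, and it relies on the fact that `itertools.permutations` applied to the sorted input `range(d)` yields the patterns in exactly that order.

```python
import math
from collections import Counter
from itertools import permutations


def ordinal_pattern(segment):
    """Indices that sort the segment increasingly (ties keep temporal order)."""
    return tuple(sorted(range(len(segment)), key=lambda i: segment[i]))


def pattern_distribution(series, d):
    """Relative frequencies p(pi), listed in lexicographic order of the patterns."""
    n_segments = len(series) - (d - 1)
    if n_segments < 1:
        raise ValueError("series is shorter than the embedding dimension")
    counts = Counter(ordinal_pattern(series[q:q + d]) for q in range(n_segments))
    # permutations(range(d)) enumerates (0,1,2), (0,2,1), (1,0,2), ... in order.
    return [counts.get(pi, 0) / n_segments for pi in permutations(range(d))]


def fisher_information(probs):
    """Normalized discrete Fisher information of an ordered distribution."""
    # F_0 = 1 when all the mass sits on the first or the last pattern, else 1/2.
    f0 = 1.0 if (probs[0] == 1 or probs[-1] == 1) else 0.5
    return f0 * sum(
        (math.sqrt(probs[i + 1]) - math.sqrt(probs[i])) ** 2
        for i in range(len(probs) - 1)
    )
```

Under these assumptions, a uniform distribution over the d! patterns gives F = 0, while a distribution concentrated on a single pattern gives F = 1, which matches the stated range 0 ≤ F ≤ 1.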
It is essential to mention that Vignat and Bercher [37] introduced the Shannon-Fisher causality plane (SFCP) to evaluate the information content and the underlying disorder of historical data. Specifically, it builds up a two-dimensional space in which the abscissa represents H_s and the ordinate represents F_s. Here, we apply the causal version [38], where both quantifiers are calculated with the Bandt & Pompe method. It has been successfully used in many data analysis applications, in fields such as Physics [39,40,41], Finance [42], and Biomedicine [43].

Sliding window approach
We applied the sliding window approach to carry out a time-dependent analysis of H_s and F_s. The Sliding window approach is based on the following sequence. Considering a time series $y_1, \dots, y_N$, we construct the sliding windows $k_t = (y_{1+t\Delta}, \dots, y_{w+t\Delta})$, $t = 0, 1, \dots$, where $w$ is the window size and $\Delta$ is the sliding step.
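The window construction can be sketched in a few lines of Python. The generator below is a hypothetical helper, not the authors' code; because the upper limit of the index t is not given in the text, it assumes that only windows lying entirely inside the series are kept.

```python
def sliding_windows(series, w, step):
    """Yield (t, window) pairs, where window holds y_{1+t*step}, ..., y_{w+t*step}.

    Python lists are 0-based, so the window for index t is
    series[t*step : t*step + w]. Assumption: incomplete windows at the
    end of the series are discarded.
    """
    if w < 1 or step < 1:
        raise ValueError("window size and step must be positive")
    t = 0
    while t * step + w <= len(series):
        yield t, series[t * step : t * step + w]
        t += 1


# Toy usage: a series of 10 points, windows of 4 points shifted by 3.
for t, window in sliding_windows(list(range(10)), w=4, step=3):
    print(t, window)
```

Under this convention, a series of 323 observations with a window of 120 observations and a step of 7 observations would produce 30 windows. Each window can then be passed to whatever routine evaluates the two quantifiers.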

Data
The main source for this analysis is the time series of the daily number of deaths related to COVID-19 for 43 countries. It is crucial to mention that the data employed in this study refer to the date of death and not to the date on which the death was recorded. Thus, there are no gaps in our data.
The period covers approximately one year, from February 19th, 2020 until January 06th, 2021, with 323 observations. The data were obtained from https://covid19.who.int/.
We investigate the time series of daily deaths from COVID-19 instead of the daily number of confirmed cases because testing capacity was not uniform across countries, which implied substantial underreporting of confirmed cases. Table 1 presents the total deaths from COVID-19 by country.

Empirical results
For each country, the time series of the daily number of deaths related to COVID-19 contains 323 observations. Fig. 1 presents the timeline of the daily number of deaths related to COVID-19.
We use the Bandt & Pompe method to evaluate the information theory quantifiers H_s and F_s. We choose the embedding dimension d = 4, which satisfies the condition T > 5d!, where T is the length of the series (323 > 5 × 4! = 120), to calculate the permutation entropy (H_s) and the Fisher information measure (F_s).
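The choice of d can be checked with a short calculation. The helper below is a hypothetical illustration of the rule of thumb T > 5d!; it simply searches for the largest embedding dimension that still satisfies the inequality for a given series length.

```python
import math


def max_embedding_dimension(n_obs, factor=5):
    """Largest d >= 2 with n_obs > factor * d!, or None if even d = 2 fails."""
    best = None
    d = 2
    while n_obs > factor * math.factorial(d):
        best = d
        d += 1
    return best


# With 323 daily observations: 5 * 4! = 120 < 323, but 5 * 5! = 600 > 323.
print(max_embedding_dimension(323))
```

For the 323 observations used here the sketch returns 4, since d = 5 would require more than 600 observations.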
Then, we use H_s and F_s to build the Shannon-Fisher causality plane (SFCP) to jointly quantify the disorder and evaluate the randomness present in the time series of daily death cases related to COVID-19. Given this, we obtain a diagrammatic representation of the COVID-19 worldwide lethality.
We also study the dynamical behaviour of the shuffled time series of daily deaths related to COVID-19. We apply the SFCP to these series, where the shuffling procedure consists of 1000 × N transpositions on each series. The diagrammatic representation for the time series of the daily number of deaths for all 43 countries is shown in Fig. 2.
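A shuffled surrogate of this kind could be produced as in the sketch below. The text does not specify how the transpositions are drawn, so the code assumes that each transposition swaps two positions chosen uniformly at random; the function name and the seed parameter are hypothetical.

```python
import random


def shuffle_by_transpositions(series, swaps_per_point=1000, seed=None):
    """Return a copy of the series after swaps_per_point * N random swaps."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    shuffled = list(series)
    n = len(shuffled)
    for _ in range(swaps_per_point * n):
        # Assumption: both positions are drawn uniformly; i == j is allowed
        # and then leaves the series unchanged for that step.
        i, j = rng.randrange(n), rng.randrange(n)
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


# Toy usage with made-up counts (not data from the paper).
toy_series = [0, 0, 1, 3, 7, 12, 9, 4, 2, 1]
print(shuffle_by_transpositions(toy_series, seed=0))
```

Shuffling keeps the distribution of values but destroys their temporal order, so the shuffled series is expected to move towards high entropy and low Fisher information in the SFCP.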
From a purely mathematical perspective, the Euclidean distance is a simple yet robust technique for measuring the distance between two distinct points in an n-dimensional Euclidean space. Although there are other distances (e.g., Mahalanobis and Manhattan), the Euclidean distance is the most used due to its ease of implementation.
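As a sketch of how this distance can be used in the causality plane, the code below measures how far a point (H_s, F_s) lies from the vertex (1, 0), the reference point used in the sliding-window analysis, which corresponds to maximal entropy and zero Fisher information. Ordering countries by this distance is an assumption made for illustration, and the labels and values in the usage example are invented.

```python
import math


def distance_to_random_vertex(h, f, vertex=(1.0, 0.0)):
    """Euclidean distance between the point (h, f) and the given vertex."""
    return math.hypot(h - vertex[0], f - vertex[1])


def rank_by_distance(quantifiers):
    """Sort labels from the farthest to the closest to the vertex (1, 0).

    quantifiers maps a label to a pair (H_s, F_s).
    """
    return sorted(
        quantifiers,
        key=lambda label: distance_to_random_vertex(*quantifiers[label]),
        reverse=True,
    )


# Hypothetical values, for illustration only (not results from the paper).
example = {"A": (0.20, 0.80), "B": (0.95, 0.05), "C": (0.60, 0.30)}
print(rank_by_distance(example))
```

In this toy example the label farthest from (1, 0) comes first, which under the interpretation used in this paper would be the most predictable series.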
To provide more empirical evidence on the COVID-19 lethality, we use H_s and F_s to rank the COVID-19 lethality in these countries based on the complexity hierarchy. Table 2 shows the taxonomy of the COVID-19 lethality based on the complexity hierarchy (H_s × F_s).
Our results suggest that a higher value of H_s leads to a lower value of F_s. Conversely, a lower value of the permutation entropy (H_s) indicates a higher value of the Fisher information (F_s). Thus, lower entropy implies higher predictability and a better understanding of the COVID-19 lethality, owing to the larger amount of information that can be extracted from this complex phenomenon.
We also carry out a dynamical analysis of the daily deaths over the whole period considered in this study. To this end, we apply the SFCP analysis in sliding windows with embedding dimension d = 4, window size w = 120 days (4 months), and sliding step Δ = 7 days (1 week). Fig. 3 depicts the dynamical analysis of the COVID-19 lethality using the sliding window for these countries.
The information theory quantifiers were employed to understand the predictability behaviour of COVID-19 lethality. Fig. 4 shows the dynamical interplay between H_s and F_s in terms of the distance from the vertex (1, 0), with embedding dimension d = 4, window size w = 120 days (4 months), and sliding step Δ = 7 days (1 week).

Discussion
We present the ranking of predictability of countries concerning the daily deaths from COVID-19. The nine best-ranked countries had some common characteristics: they had few daily deaths over time (Fig. 1) and fewer than 25 accumulated deaths per 100 thousand inhabitants, and at no time did they show an explosion in deaths. In this case, we cannot speak of a wave of deaths as in other countries. Consequently, these countries have always remained comparatively well ranked since the beginning of the pandemic (Fig. 3). This probably occurred due to the consistent proactivity of these countries throughout the pandemic period. Lu et al. [44] highlighted the rapid response to the pandemic in Taiwan and Singapore. Baker et al. [45] also highlighted New Zealand's fast and effective response to SARS-CoV-2, and La et al. [46] verified cooperation between civil society, government, and private individuals in Vietnam's rapid and sustainable response to the coronavirus epidemic.
In contrast, the nine countries that presented the lowest predictability in the face of the COVID-19 pandemic exhibited many daily deaths over the time analyzed (Table 1) and, except for India, an accumulated number of more than 50 deaths per 100 thousand inhabitants [7]. Unlike the best-ranked countries (Table 2), they presented one or more moments of explosion in the daily number of deaths, and it can be said that these countries experienced one, two, or three waves over the analyzed time (Fig. 1). Italy was the first western country to experience the COVID-19 epidemic, and in the first few months, the country experienced exponential growth in the number of deaths and cases [47].
Although Brazil and the United States have a high number of accumulated deaths from COVID-19 over the analyzed time, the predictability of these countries is greater than that of several countries studied. In Brazil's case, where we know the epidemic situation of COVID-19 better, we believe that this is because SARS-CoV-2 initially arrived in large cities and only later spread into the interior of the country, at different paces. As a result, the country had for a long time a high and consistent number of deaths, fed initially by large urban centers and later by smaller cities. Thus, the country has not experienced several waves in the number of deaths. In a preliminary study, we analyzed the number of fatalities per Brazilian state, where we could see this difference in viral dynamics between states [19].
Fig. 1. Timeline of daily death number related to COVID-19 from February 19th, 2020 until January 06th, 2021.

Conclusions
This paper provides relevant empirical evidence on the predictability of COVID-19 worldwide lethality that contributes to a better understanding of this complex viral disease. To this end, we have analyzed the time series of daily death cases related to COVID-19 for 43 countries. We use the Bandt & Pompe method to estimate the permutation entropy (H_s) and the Fisher information measure (F_s). Based on these information theory quantifiers, we build the SFCP, making it possible to quantify the disorder and evaluate the randomness in the time series of daily death cases related to COVID-19.
Our results indicate that the countries located near the ideal random position in the SFCP are characterized by high entropy and high disorder (low predictability). Conversely, countries that are more distant from the ideal random position are characterized by lower entropy and low disorder (high predictability). This suggests that the countries that were most proactive in implementing social distancing, closure of educational institutions, facemasks, lockdown, testing of symptomatic and asymptomatic individuals, and hygienic measures to limit the impacts of COVID-19 present lower entropy (higher predictability) of the COVID-19 lethality. On the other hand, the most reactive countries in implementing these measures present higher entropy (lower predictability).
In this sense, we can observe an inverse relationship between the Permutation entropy and the Fisher information measure. Thus, it is essential to emphasize that lower entropy reflects higher predictability and, consequently, a better understanding of the COVID-19 lethality, derived from the larger amount of information extracted from the underlying stochastic process.
We also use the Sliding window approach to investigate the dynamical interplay between H_s and F_s in terms of the distance from the vertex (1, 0), with embedding dimension d = 4, window size w = 120 days (4 months), and sliding step Δ = 7 days (1 week). From a public health perspective, it is a useful tool for the dynamic monitoring of the COVID-19 worldwide lethality. Thus, we shed light on the thorny problem of the effectiveness of the measures adopted by different countries to combat the pandemic and the lethality of COVID-19. Our findings can support decision-making by public agents regarding measures to deal with this pandemic.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

[Figure caption, beginning lost in extraction] Dynamical analysis of the COVID-19 lethality using the sliding window for these countries: from April 01, 2020, until May 06, 2020; (C) from May 13, 2020, until June 17, 2020; (D) from June 24, 2020, until July 29, 2020; (E) from August 05, 2020, until September 09, 2020; (F) for the three first weeks and three last weeks.