Malaria patients in Nigeria: Data exploration approach

Malaria is a life-threatening disease that is usually transmitted to people through the bite of infected female Anopheles mosquitoes. This article presents a data exploration of malaria symptoms reported by 337 patients attended to at Federal Polytechnic Ilaro Medical centre, Ogun State, Nigeria. The study covers four (4) weeks of monitoring of patients' attendance, their consultations with a physician, and their malaria test results as compared with their claims of malaria infection. Logistic regression was used for the basic analysis of the dataset. It was found that people in the age range 38–47 years were the most affected by malaria, that females were the more frequently infected gender, and that headache was the most significant symptom based on its Wald statistic. This study strongly recommends the introduction of a long-lasting malaria prevention scheme that cuts across all age categories and genders within the Nigerian community, and that self-medication be strongly discouraged, as most claims of malaria were not confirmed upon testing.


© 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
The data set used in this article was collected as secondary data from Federal Polytechnic Ilaro Medical centre, Ilaro, Ogun State, Nigeria, and contains information on 337 patients who presented themselves for consultation on malaria-related infections. The symptoms reported by the patients were recorded, and further information on the same patients was collected after they had been tested for malaria. The patients are between 3 and 77 years of age; 180 are female and 157 are male. The data were collected over a period of 4 weeks. The symptoms reported by the patients were compared with the results of the malaria test, and the test result was used as the target variable.
This dataset consists of 15 malaria symptoms, which are "Fever, Cold, Rigor, Fatigue, Headache, Bittertongue, Vomiting, Diarrhea, Convulsion, Anemia, Jaundice, Cocacola-Urine, Hypoglycemia, Prostration, and Hyperpyrexia" as collected. In the dataset, the ages of the patients are recorded in years, while gender is encoded as "0" for Male and "1" for Female. Other features are encoded as integers ("0" for absence and "1" for presence of the symptom). This raw dataset, which has been approved by the medical director representing the institutional bioethics committee, is available and can be accessed as Supplementary data. Descriptive analyses were performed, and logistic regression analysis was also used to describe and analyze the data set. The data is summarized under the following classifications: classification based on gender (sex), malaria infection classification by age, classification of malaria infection by sex, and classification based on some common malaria symptoms.

Specifications Table
Subject: Medicine
Specific subject area: Epidemiological, Public health, Biostatistics
Type of data: All the data are available in this data article as supplementary materials

Value of the Data
The data on malaria infection could be useful for government and health workers in making decisions that would reduce the risk of malaria infection among the populace. This work provides a deeper understanding of the prevalence and prognosis of malaria infection. The data can be useful in malaria infection awareness, management and treatment.
The data could be used as a baseline for comparison in future studies. The data reveal highly significant impacts of prevalent factors such as headache, pain, fever and cold on malaria morbidity.

Table 7
Test of model coefficients.

Omnibus Tests of Model Coefficients
Chi-square   df   Sig.
Step

Table 8
Model summary.
Step   −2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      404.614 a           0.083                  0.115
a Estimation terminated at iteration number 4 because parameter estimates changed by less than .001.

Table 9
Hosmer and Lemeshow test.
Step   Chi-square   df   Sig.

From Table 1, it can be seen that the mean age of the patients is 30.35 years, and the minimum and maximum ages are 3 years and 77 years respectively. The data set is slightly positively skewed and leptokurtic, with coefficients of skewness and kurtosis of 0.755 and 0.536 respectively.
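The skewness and kurtosis coefficients quoted above can be computed along the following lines. This is a minimal sketch, assuming the bias-adjusted sample estimators that statistical packages commonly report (the article does not state which formula was used); the function name and the example ages are hypothetical and are not taken from the dataset.

```python
from math import sqrt

def sample_skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) using bias-adjusted sample formulas.

    Assumption: the adjusted Fisher-Pearson estimators are used; the
    article does not say which estimator produced its coefficients.
    """
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(values) / n
    # Central moments about the mean (divisor n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5          # moment-based skewness
    g2 = m4 / m2 ** 2 - 3.0      # moment-based excess kurtosis
    # Small-sample adjustments.
    skew = sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    return skew, kurt

# Hypothetical ages, for illustration only (not the study data).
example_ages = [3, 5, 8, 12, 40]
skew, kurt = sample_skewness_kurtosis(example_ages)
print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

A positive skewness indicates a longer right tail (more young patients, a few much older ones), and a positive excess kurtosis corresponds to the leptokurtic shape described in the text.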
A diagrammatic representation of the age distribution and age range of the patients is shown in Figs. 1 and 2 respectively. The ages of the patients were classified into eight groups (or classes), and the respective frequencies are shown in Table 2. It can be seen from Table 2 that the largest group (50 patients) is the age group 38–47 years, which is approximately 15% of the patients in the data set. The diagrammatic representation of the information in Table 2 is shown in Fig. 2.
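The grouping of ages into classes can be sketched as follows. The class boundaries are an assumption: the text names only the 38–47 class, so the sketch uses ten-year classes aligned to that one, which happens to give eight classes over the observed range of 3 to 77 years. The function names are hypothetical.

```python
from collections import Counter

def age_group(age, anchor=38, width=10, min_age=3):
    """Label the class that contains `age`.

    Assumption: classes are `width` years wide and aligned so that one
    class starts at `anchor` (the 38-47 class named in the text). The
    lowest class is clipped at `min_age`, the youngest reported age.
    """
    start = anchor + ((age - anchor) // width) * width
    end = start + width - 1
    return f"{max(start, min_age)}-{end}"

def group_frequencies(ages):
    """Return {class label: (count, percentage of all ages)}."""
    counts = Counter(age_group(a) for a in ages)
    total = len(ages)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}

# The share quoted in the text: 50 of 337 patients.
print(round(100.0 * 50 / 337, 1))  # 14.8
```

Under these assumed boundaries the classes are 3-7, 8-17, 18-27, 28-37, 38-47, 48-57, 58-67 and 68-77; the actual boundaries are those given in Table 2.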
Information on gender is shown in Table 3, together with the respective frequencies. From Table 3, it can be seen that a majority of the patients (180 of 337) were female. The diagrammatic representation is shown in Fig. 3.

Analysis on malaria diagnosis using logistic regression
Information on the diagnosis of patients who presented themselves for malaria treatment is shown in Table 4. Only 116 of the 337 reported cases were actually found to be infected with malaria, and most of the infected patients were female. The diagrammatic representation of Table 4 is shown in Fig. 4. Fig. 5 shows the chart of the predicted probabilities, with a cut value (threshold) of 0.5, and goodness of fit was assessed using the Hosmer and Lemeshow test.
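The two steps mentioned above, classifying at a cut value and checking goodness of fit, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used to produce Tables 9 and 10: the function names are hypothetical, cases are split into equal-sized groups by predicted probability, and a probability exactly equal to the cut value is assigned to the positive class.

```python
def classify(probabilities, cut=0.5):
    """Predict 1 when the probability reaches the cut value (assumed rule)."""
    return [1 if p >= cut else 0 for p in probabilities]

def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (chi_square, df) for a Hosmer-Lemeshow style statistic.

    Cases are sorted by predicted probability and split into `groups`
    near-equal chunks; observed and expected counts of positives and
    negatives are compared within each chunk.
    """
    # Stable sort on probability only, so ties keep their input order.
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0])
    n = len(pairs)
    chi_square = 0.0
    used = 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected_pos = sum(p for p, _ in chunk)
        observed_pos = sum(y for _, y in chunk)
        expected_neg = size - expected_pos
        observed_neg = size - observed_pos
        if expected_pos > 0:
            chi_square += (observed_pos - expected_pos) ** 2 / expected_pos
        if expected_neg > 0:
            chi_square += (observed_neg - expected_neg) ** 2 / expected_neg
        used += 1
    # Degrees of freedom: number of non-empty groups minus 2.
    return chi_square, used - 2
```

A small statistic relative to a chi-square distribution with the returned degrees of freedom suggests that predicted and observed frequencies agree; statistical packages may form the groups differently when probabilities are tied, so their values can differ from this sketch.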

Experimental design, materials and methods
This article assesses the statistical significance of the perceived as well as diagnosed malaria symptoms using logistic regression analysis. It also examines the linear relationship between the predicted binary malaria classes. Research on malaria has been of great concern to governments and world health organizations. According to Ref. [1], there were an estimated 435,000 deaths from malaria globally in 2017, compared with 451,000 estimated deaths in 2016 and 607,000 in 2010. Several aspects of malaria prediction have been studied, using different forms of dataset such as malaria cell image datasets and various numerical datasets.
Artificial neural networks, machine learning/data mining and deep learning methods have helped previous researchers to predict malaria outbreaks and infections in different regions and communities all over the world. Some have gone as far as using geospatial and weather-based datasets to predict malaria, with considerable success in previous years, and various recommendations have been made [1–9].
Malaria is transmitted exclusively through the bites of Anopheles mosquitoes. The intensity of transmission depends on factors related to the parasite, the vector, the human host, and the environment. Symptoms of malaria include fever, headache, and vomiting, as well as the other symptoms listed in the dataset, and usually appear between 10 and 15 days after the mosquito bite. If not treated, malaria, particularly falciparum malaria, can quickly become life-threatening by disrupting the blood supply to vital organs [10–14].
The chi-square test of independence can equally be used to analyze the data collected. For instance, a cross-tabulation of gender and malaria outcome of the patients after testing can be arranged as a contingency table, as shown in Table 4. In this research, however, logistic regression analysis was used to analyze the data set. Table 5 shows the classification table at step 1. Table 6 shows the variables in the equation at step 1. Table 7 shows the omnibus tests of model coefficients. Table 8 shows the model summary using the −2 log-likelihood, Cox & Snell R square and Nagelkerke R square. Table 9 shows the Hosmer and Lemeshow test. Table 10 shows the contingency table for the Hosmer and Lemeshow test. Table 11 shows the classification table for step 1. Fig. 5 shows the diagram of predicted probabilities.
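The relationship between the −2 log-likelihood and the two pseudo R square measures in the model summary can be illustrated with a short calculation. This is a sketch under one assumption: that the model was fitted to all 337 patients, of whom 116 tested positive, so that the intercept-only (null) model can be reconstructed from those two counts. The function names are hypothetical; the inputs 404.614, 116 and 337 are taken from the article.

```python
from math import exp, log

def null_deviance(n_positive, n_total):
    """-2 log-likelihood of the intercept-only model for a binary outcome."""
    p = n_positive / n_total
    n_negative = n_total - n_positive
    return -2.0 * (n_positive * log(p) + n_negative * log(1.0 - p))

def pseudo_r_squared(model_deviance, n_positive, n_total):
    """Return (Cox & Snell R square, Nagelkerke R square)."""
    d0 = null_deviance(n_positive, n_total)
    # Cox & Snell: 1 - (L0 / L1)^(2/n), written in terms of deviances.
    cox_snell = 1.0 - exp((model_deviance - d0) / n_total)
    # Nagelkerke rescales by the largest value Cox & Snell can reach.
    max_cox_snell = 1.0 - exp(-d0 / n_total)
    return cox_snell, cox_snell / max_cox_snell

cox_snell, nagelkerke = pseudo_r_squared(404.614, 116, 337)
print(round(cox_snell, 3), round(nagelkerke, 3))  # 0.083 0.115
```

Under that assumption the calculation agrees with the reported Cox & Snell and Nagelkerke values to three decimals. The drop from the null deviance to the model deviance is the likelihood-ratio chi-square that an omnibus test of model coefficients evaluates.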