Breast cancer patients in Nigeria: Data exploration approach

Breast cancer is the type of cancer that develops from breast tissue. It is most common in women and is one of the most studied diseases, largely because of its high mortality (second to lung cancer). However, it also occurs in males. This article presents a statistical study of the distribution of age, gender, length of stay, mode of diagnosis, status (dead or alive) after treatment and the location of breast cancer among 300 patients admitted to the University of Ilorin Teaching Hospital, Ilorin, Nigeria. The study covers a period of five (5) years, from 2011 to 2016, and logistic regression was used to perform the basic analysis. It was found that the age of the patients and the location of the breast cancer (right or left) contribute significantly to the survival of the patients. Early detection and treatment of the disease are nonetheless highly encouraged. This study also recommends that awareness should be taken to the grassroots and that males should not be excluded from this discussion.



Value of the data
The data on breast cancer could be useful for government and health workers to make decisions that would reduce the risk of breast cancer among the populace.
The data provide an analysis of the age, gender, location of the breast cancer, mode of diagnosis, length of stay (LOS) and outcome of treatment of breast cancer patients for the population studied.
The data can be analyzed further using other statistical tools such as the chi-square test, multiple linear regression and Poisson regression analysis.
The results of the analysis can be compared with other oncologic studies. The interpretation of the data could be helpful in educational studies, epidemiologic oncology, molecular pathologic epidemiology, and breast cancer awareness and screening.
The study can be replicated or extended to longitudinal studies. The article provides insight on the impact and consequence of age and location of breast cancer on the survivability of breast cancer patients.

Data
The data set used in this article was collected as secondary data and contains information on 300 breast cancer patients. It was obtained from the Cancer Registry Department under the Department of Admission and Discharge Unit, University of Ilorin Teaching Hospital (UITH), Ilorin, Nigeria. It includes information on 275 females and 25 males and covers a period of five (5) years, from 2011 to 2016. The patients were all treated as in-patients and were later discharged. Of these, 97 patients were discharged dead while 203 patients were discharged alive. The raw data are available and can be accessed as Supplementary data.
Descriptive analyses were performed and logistic regression analysis was also used to describe and analyze the data set.
The data is summarized under different classifications: gender (sex), location of the breast cancer, mode of diagnosis, survival after treatment, age and length of stay in the hospital during treatment.
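For reference, the general form of a binary logistic regression model is sketched below. This is the standard textbook formulation, not the fitted model from this study. The symbols $p$, $x_j$ and $\beta_j$ are notation introduced here for illustration, and treating "discharged alive" as the modelled event is an assumption, since the direction of the outcome coding is not stated in this section.

```latex
\[
\log\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_j ,
\qquad
p = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_j\right)\right]}
\]
```

Here $p$ would be the probability of the outcome for a patient, and the $x_j$ would be the coded predictors (for example length of stay, age, location of the cancer, mode of diagnosis and gender).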

Analysis of age of the patients
The frequency table for the age of all 300 patients is shown in Table 1. The mean age of the patients is 49.71 years, and the minimum and maximum ages are 20 years and 96 years respectively. The age distribution is slightly positively skewed, with a coefficient of skewness of 0.572. A diagrammatic representation of the age of the patients is shown in Fig. 1.
The ages of the patients were classified into three different groups (or classes) and the respective frequencies are shown in Table 2.
It can be seen from Table 2 that the largest group of patients (115) falls in the age group 41-55 years, which accounts for 38.3% of the total population under study.
The diagrammatic representation of the information in Table 2 is as shown in Fig. 2.
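The summary measures quoted for age (mean, minimum, maximum, skewness and the percentage share of an age group) can be reproduced along the following lines. This is a minimal sketch: the `ages` list is hypothetical and is not the study's data, the function names are invented for illustration, and the adjusted Fisher-Pearson skewness formula is an assumption because the source does not say which skewness coefficient was used.

```python
from math import sqrt

def describe(values):
    """Return mean, min, max and adjusted Fisher-Pearson sample skewness."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divisor n - 1).
    s = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Adjusted Fisher-Pearson coefficient; requires n >= 3 and s > 0.
    skew = n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)
    return {"mean": mean, "min": min(values), "max": max(values), "skewness": skew}

def share(count, total):
    """Percentage share of a class, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Hypothetical ages, for illustration only (not the study's records).
ages = [20, 35, 41, 44, 48, 50, 52, 55, 63, 96]
summary = describe(ages)

# Share of the 41-55 age group reported in Table 2: 115 of 300 patients.
pct_41_55 = share(115, 300)
print(summary, pct_41_55)
```

A positive skewness value indicates a longer right tail, which is the pattern the article reports for patient age.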

Analysis on length of stay of the patients at the hospital
Information on the length of stay of the patients in the hospital before discharge, with the respective frequencies, is shown in Table 3.
From Table 3, it can be seen that 106 of the patients, the largest group, were discharged early, in fewer than 11 days.
The diagrammatic representation is as shown in Fig. 3.

Table 5 Gender of the patients by outcome at discharge.

Sex      Alive   Dead   Total
Female   188     87     275
Male     15      10     25
Total    203     97     300

Analysis on the gender of the patients
The information on the gender of the patients is shown in Table 4. It can be seen in Table 4 that the majority (275) of the patients are females. The table also shows that breast cancer occurs among male patients, who account for 25 of the 300 cases.
The information in Table 4 is represented diagrammatically in Fig. 4.
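The gender counts can also be cross-classified against outcome at discharge (188 females and 15 males in one outcome column, 87 females and 10 males in the other) and checked with a chi-square test of independence. The sketch below is illustrative only: the function name is invented here, labelling the columns as "alive" and "dead" is inferred from the reported totals of 203 and 97, and no continuity correction is applied. This is not the analysis reported in the article, which used logistic regression.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and p-value
    for a 2 x 2 table given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

# Rows: female, male. Columns: discharged alive, discharged dead (assumed order).
observed = [[188, 87], [15, 10]]
stat, p_value = chi_square_2x2(observed)
print(f"chi-square = {stat:.3f}, df = 1, p = {p_value:.3f}")
```

For these counts the statistic is about 0.73 on one degree of freedom, below the 5% critical value of 3.84, so this simple test alone would not indicate an association between gender and outcome.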

Experimental design, materials and methods
Research on breast cancer and other forms of cancer is intense because of the high fatality rate of the disease if it is not properly managed. Several aspects of breast cancer have been studied, some of which have generated data sets. The analysis of those data sets is based on various experimental designs, research materials and established scientific methods. Some of these areas are: CT images, growth factor levels in incident breast cancer, hormone receptor status, cytokine circulation, secretagogue users in breast cancer treatments, chemokine levels, breast cancer and diabetes mellitus comorbidity and treatment, breast cancer and HIV treatment, and breast cancer and pregnancy. Others are: proteome analysis, risk factor analysis, breast examination, screening, management and breast cancer awareness, epidemiology, risk assessment tools, treatment options (radiotherapy versus chemotherapy), survival analysis, breast cancer subtypes, biomarkers, socio-cultural barriers to treatment, socio-demographic factors and alternative medicine approaches, genetic risk, dietary patterns, and early diagnostics and treatment.

A chi-square test of independence can be used to analyze the data collected; for instance, a cross-tabulation of gender and outcome of the patients at the point of discharge can be arranged in an r × c contingency table, as shown in Table 5. In this research, however, logistic regression analysis was used to analyze the data set. See similar analyses in [27][28][29][30].
Table 6 presents the coding for the variables length of stay, age, location of cancer, mode of diagnosis and gender of the patients. Table 7 shows the classification table at Step 0. Table 8 shows the variables in the equation at Step 0. Block 1: Method = Backward Stepwise (Conditional). Table 9 shows the omnibus tests of model coefficients. Table 10 shows the model summary using the log-likelihood, Cox & Snell R square and Nagelkerke R square. Table 11 shows the variables in the equation from Step 1 to Step 4. Table 12 shows the Hosmer and Lemeshow test. Table 13 shows the classification table for all the steps (Steps 1-4). The predictive probability is shown in Fig. 5.
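To make the Step 0 stage concrete, the intercept-only (null) logistic model can be worked out directly from the discharge totals of 203 alive and 97 dead. The sketch below is an illustration under stated assumptions: the function name and dictionary keys are invented here, and coding "discharged alive" as the event is an assumption. It is not a reproduction of the article's Tables 7 and 8.

```python
from math import log

def null_model(n_event, n_total):
    """Intercept-only logistic model for n_event events among n_total cases."""
    p = n_event / n_total
    # The fitted intercept equals the log-odds of the event.
    intercept = log(p / (1 - p))
    # Log-likelihood when every case is assigned the same probability p.
    log_lik = n_event * log(p) + (n_total - n_event) * log(1 - p)
    # With no predictors, every case is classified into the larger class.
    accuracy = max(p, 1 - p)
    return {"intercept": intercept, "minus2LL": -2 * log_lik, "accuracy": accuracy}

# 203 of the 300 patients were discharged alive.
# Treating "alive" as the event is an assumption about the outcome coding.
step0 = null_model(203, 300)
print(step0)
```

If death were coded as the event instead, the intercept would change sign, while the -2 log-likelihood and the baseline classification accuracy (203/300, about 67.7%) would stay the same. Later steps of a stepwise fit are judged against this baseline.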
Breast cancer is a dangerous disease. It occurs in both males and females, but the incidence is higher in females. Based on the present study, the age of the patient and the location of the breast cancer (right breast or left breast) both contribute significantly to whether a patient survives the disease.