Visualising Developing Nations Health Records: Opportunities, Challenges and Research Agenda

The benefits of effectively visualizing health records in huge volumes have transformed the way health organizations, insurance companies, policy and decision makers, governments and drug manufacturers conduct research. This has also played a key role in determining investment of resources. Health records contain highly valuable information; processing these records in large volumes is now possible due to technological advancement, which allows for the extraction of highly valuable knowledge that has resulted in breakthroughs in scientific communities. To visualize health records in large volumes, the records need to be stored in electronic forms, properly documented, processed, and analyzed. A good visualization technique is then used to present the analyzed information, allowing for effective knowledge extraction in a secure manner that protects the privacy of the patients whose health records were used. As research and technology have advanced, the quality of knowledge extracted from health records has also improved; unfortunately, the numerous benefits of visualizing health records have only been felt in developed nations, unlike other sectors where technological advancement in developed nations has had a similar impact in developing nations. This paper identifies the characteristics of health records and the challenges involved in processing large volumes of health records, in order to identify possible steps that developing nations could take to benefit from visualizing health records in huge volumes.


INTRODUCTION
The global pandemic caused by the Coronavirus in 2020 can be seen as an indication that more needs to be done to support the development of healthcare platforms that transcend national or state borders. Health records are private, yet contain information that could be very beneficial to the patient and society at large when analyzed [7,47,59]. Therapeutics, monitoring, treatment and care of patients in the healthcare industry produce a vast amount of data, currently estimated to be in exabytes yearly [47]. Applying techniques from big data research to healthcare data has been shown to improve the understanding, monitoring and prediction of health challenges [47,60,61]. Big data research has received a lot of attention in recent years [40] and has been listed among the top 10 critical and strategic technology trends in the last decade [39]. Kalantari et al. [27] pointed out in their bibliometric analysis of big data that the top countries are publishing a vast number of articles in the field, while 96 other countries have no publication at all, indicating a lack of interest in the field. Big data research has proven to be very beneficial, transforming the way problems are approached in several fields including healthcare [35]. A key area in big data research is visualisation. Big data visualisation is the graphical representation of data using visual elements, providing a means for the extraction of information and enabling one to make broad predictions [28,35]. Visualising health records provides useful insight with numerous benefits to society [4,19].
Visualization as an effective tool for disseminating big data has been gaining traction in the health sector for several reasons: it protects the identity of patients, as the data is viewed collectively rather than individually; it allows both scientists and the general audience to understand health data; and it detects hidden patterns and correlations [8,54] and spots meanings that other machine techniques might miss. By visualizing the health records of a locality, a disease outbreak can be predicted faster; drug behaviors and any potential outbreak of an infection can also be quickly noticed and effectively monitored on a large scale. This survey paper identifies the opportunities and challenges of visualizing health records in developing nations. Like never before, people are concerned not only about their own health, their loved ones' health or their own nation's health outlook, but also about the health situation in other nations, raising concerns about the accuracy of what is being reported in developing nations [16]. Just as big data changed the way we do business and management in large organizations [39], having national and global visualization platforms in the healthcare sector could transform the way healthcare research is performed, the way funding is channeled, and the way future pandemics could be prevented. By seeking to understand the challenges that could be unique to developing nations and identifying the factors hindering the development of reliable large-scale or centralized visualization strategies, solutions could be proffered to address these challenges. Our contributions in this study are:
(1) We broadly identified the challenges involved in visualizing patients' health records and the steps that have been taken so far to address these challenges.
(2) We identified challenges unique to developing nations with regard to the handling of health records and the dissemination of information from health records using visualization techniques in the healthcare sector.
Big data and electronic health records have received a lot of attention over the past decade, but little has been done in developing nations [27]. This paper therefore focuses on ensuring that developing nations, irrespective of the challenges they face locally, can apply some of the advancements made in the technological field to the healthcare sector, enhancing the use of visualization to extract useful knowledge and ensure the dissemination of information. This paper is organized as follows: Section 1 is the introduction; Section 2 provides an overview of health records, how they relate to big data, and the visualization of health records. Health records characteristics are discussed in Section 3, while Section 4 focuses on processing health records. Management challenges are discussed in Section 5; Section 6 discusses challenges that are unique to developing nations, and Section 7 is the conclusion.

OVERVIEW
Governments and individuals spend a lot of resources in the health sector, a challenging service affecting patients' lives and economies [25]. Information is extracted from patients to help improve their health situation, and this information can also be used for various other services; as such, the manner in which the information is documented and stored is important. The use of electronic technologies to improve health care has been embraced over the years due to its numerous benefits, and health providers have been encouraged to adopt electronic health records, though several health providers, especially in developing nations, are yet to do so. Extracting valuable information from these records could have significant medical and economic benefits [5,6].

Health Records
Health records are records generated from patients' diagnostic services and stored by health professionals, or captured through sensors attached to persons. They contain a person's social demographics, family history, lab reports, imaging data, genomics data, endoscopic data, colonoscopy data, clinical notes, etc. and are currently estimated to be in zettabytes [47]. These records are not just big; they are complex, come in different formats and could be structured or unstructured.
Health records can be divided [3] into two parts: the creation part and the access part. The creation part involves the direct interaction of patients with healthcare professionals or the use of electronic data capture devices such as sensors. It also involves how the data is formatted as well as the storage medium. Various standards and policies have been established to ensure the records created meet required expectations. The access part governs who has access to the stored data, ensuring the security and confidentiality of health records. Challenges associated with health records [47] include:
• Correctness: The accuracy of the collected data is often a challenge, as incorrect data could lead to wrong outcomes.
• Incompleteness: Patients' health records are often incomplete.
• Inconsistency: Multiple persons being involved in the creation of health records often leads to inconsistency.
• Privacy: Privacy concerns are often raised, as health records may be shared or linked without consent or authorisation.
• Security: The vulnerability of health data is often an issue, as criminals often target these data because they contain highly valuable information.
A diverse number of systems have been developed to handle the management and storage of these health records once created. Hospitals, laboratories, radiology units and clinics, among others, have all developed their own systems to handle such records [59]. In its Committee Draft Technical Recommendation 20514, the International Organization for Standardization (ISO) broadly defined the Electronic Health Record (EHR) as "a repository of information regarding the health status of a subject of care in a computer process-able form, stored and transmitted securely and accessible by multiple authorized users". Making the most of health data involves several crucial steps that every health provider would need to take part in, as described in section 6.

Health records as big data
Big data, as defined by Andrea et al., "is the Information asset characterized by such a high volume, velocity and variety to require specific technology and analytical methods for its transformation into value" [15]. Radhika et al. [40] suggested that a 100 MB file to be sent via email could be considered big data, since the attachment capacity for email is restricted to 25 MB. Seagate's report predicts that by 2025 data creation will swell to 163 zettabytes (ZB) [14]. Electronic health records have witnessed tremendous growth, from 500 petabytes in 2012 to several zettabytes (ZB) in 2020 [35,59]. With the massive amount of data currently being captured from health records, the current challenge is extracting meaningful information that can be acted upon. Uncertainty, however, remains about the use of big data analytics on health records, especially with regard to its governance [6,60].

Visualizing Health records
Health records visualization entails the use of tools that produce graphics, images, diagrams or animations to capture information from health records, display it in an easily understandable form and allow for interaction with the data, thus facilitating the discovery of knowledge [35,59]. A good health records visualization allows for quick analysis of health records and for visual story-telling that captures the attention of an audience, allowing for easy extraction of knowledge at a faster rate and with less mental work [13,35]. A good health records visualization technique requires the use of an effective and interesting visualization type.
Examples of visualization techniques are word clouds, dashboards, charts, tables, graphs, maps, info-graphics, area charts, bar charts, box-and-whisker plots, bubble clouds, bullet graphs, circle views, dot distribution maps, heat maps, histograms, matrices, polar area diagrams, radial trees, scatter plots (2D or 3D) and tree-maps.
To create a good health records visualization [33,52,53], the designer must consider the following:
• What questions do the targeted individuals want to solve?
• What questions should the visualization answer?
• What other questions could be inspired by the visualization?
• How do I get the targeted individuals inspired?
Once these questions have been answered, the designer must then:
• Choose the right visualization technique for the job. For example, histograms conveniently show clustered data while bar charts are effective for comparison.
• Make it meaningful, choosing a pattern that can be easily followed and passing as much information as possible at a glance [13].
• Incorporate clues using colour, shapes, designs, size and text carefully and intentionally.
• Eliminate clutter.
• Add interactivity, ensuring it allows for qualitative and quantitative analysis.
• Ensure the latency is as low as possible, as delays in displaying a visualisation could result in loss of interest.
To facilitate the analysis of big data like health records, several techniques have been developed [19]; these include FIsViz, WiFIsViz, FpVAT, PyramidViz, FpMapViz and HSLviz. The use of virtual reality [54] has also been explored, and some of the most popular visualization tools [34,44] are Tableau Software, Qlik, TIBCO Software, Zoho Analytics, IBM Cognos Analytics, Microsoft Power BI, Plotly, Gephi and Oracle Visual Analyzer. Benefits of visualizing health records include:
• Risk factor identification: Identifying possible risks is often a challenge; visualization, however, makes it easy to observe risk trends.
• Survival prediction and mortality: By simply observing trends in a health data visualization, risk can be predicted more easily and without much calculation.
• Readmission or length of stay prediction: Hospital readmission is common and expensive [18]; predicting potentially avoidable readmissions benefits the patient and aids the effective utilization of healthcare resources.
• Early warning: Visualization tools are very powerful for displaying the cause-effect relationships between different events, allowing for the anticipation of future occurrences [11].
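The guideline on choosing the right visualization technique can be illustrated with a small Python sketch. It bins a numeric field for a histogram (to expose clustering) and counts a categorical field for a bar-chart style comparison, rendering both as plain text. The ages, ward names and helper functions are made up for illustration and do not come from any real health record or from the cited tools.

```python
from collections import Counter

def histogram_bins(values, bin_width):
    """Count how many values fall into each fixed-width bin (keyed by bin start)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def text_bars(counts, symbol="#"):
    """Render label -> count pairs as one text bar per line."""
    return [f"{label:>10} | {symbol * n} ({n})" for label, n in counts.items()]

# Illustrative values only: not taken from any real health record.
ages = [3, 7, 25, 31, 34, 36, 38, 62, 65, 71]
wards = ["maternity", "surgery", "maternity", "outpatient", "surgery", "maternity"]

# Histogram: continuous values grouped into 20-year bins, showing where ages cluster.
age_hist = histogram_bins(ages, bin_width=20)

# Bar chart: one bar per category, suited to comparing wards side by side.
ward_counts = dict(Counter(wards))

for line in text_bars(age_hist) + text_bars(ward_counts):
    print(line)
```

The same counts could be passed to any charting tool; the point of the sketch is only that the two chart types answer different questions about the same data set.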

HEALTH RECORDS CHARACTERISTICS
Health records characteristics arise from the life cycle of the health record. Just as different researchers have distinct understandings of the characteristics of big data, health records characteristics can be described using 3V's, 5V's or 7V's, with each V representing a unique characteristic of the health record. The 3V's often describe volume, velocity and variety, while the 5V's describe velocity, volume, variety, veracity and variability. The 7V's are discussed below [50]:

Volume
Volume is the total amount of health records produced, measured in bytes. It is impractical to define a specific threshold because, as technology improves, storage capacity increases, allowing for the storage of even more data. Processing speeds have also been increasing, allowing for the processing of larger amounts of data in real time; as a result, what we may consider big data today may not be considered big data in the future [21].

Velocity
Velocity is the rate at which health records are generated, processed, and analyzed. Analyzing unstructured data on a large scale was a challenge for most organizations, hence the data were simply stored in their databases and analyzed in bits; with technological advancements, especially machine learning techniques, more valuable information is being extracted from both structured and unstructured data at a much faster rate. The use of sensors and other electronic data capture devices has increased the amount of data being generated, hence real-time analysis of health records is a tedious task, especially for time-sensitive data [21].

Variety
Variety deals with the structural heterogeneity of data: health records are generated and stored in various formats that may be structured, semi-structured or unstructured. Structured health records include laboratory results, vital signs and International Classification of Diseases (ICD) codes. Unstructured data include images and graphics as well as narrative information (free text) such as visit notes, discharge summaries and chief complaints. Semi-structured materials include prescriptions and medications [20].

Variability
Variability deals with the different meanings health records could possess even if they look similar. Trying to analyze health records in bits could lead to wrong conclusions, whereas combining several forms of information provides a better understanding of the situation. For example, a doctor's report for a patient and a few laboratory results could inform a decision on a likely treatment course, but once the patient provides information about their family history, a different treatment course may be explored. Thus, accounting for variability in health records is very important [50].

Veracity
Veracity is the accuracy of health records. It entails more than just data quality; it also requires an understanding of the data and the discovery of discrepancies, as health records are often incomplete and contain inaccurate information [47,50].

Visualisation
Visualising health records enables the capturing of meaningful information from health records as has been described in Section 2.3.

Value
Value includes actionable information, knowledge or any useful information obtained from health records. It is essentially the reason health records are kept. Health records are very valuable as they are a repository of information that, once extracted, often has significant benefits to society. Other than its primary use in patients' personal care, the knowledge extracted could be beneficial for planning future services, resource allocation, developing standards, preventive campaigns, health facility audits and clinical research, amongst others [47].

HEALTH RECORDS PROCESSING CHALLENGES
This section explores the challenges involved in processing health records to obtain valuable information. Health records processing begins with the capture of patients' information, followed by storing the information, processing the stored information and finally extracting knowledge from it. Researchers have proposed solutions to several of the challenges in processing health records, as seen in Fig 1. These challenges are often similar to those encountered in processing big data [50] and are discussed further below.

Health Records Generation and Storage
Health records are generated from several sources. Primarily, these records are generated at the instance of a health professional in a health facility following some set standards or guidelines. How health records are generated and stored determines the ease of extracting information from them. Some health organizations use electronic means to store information, and these often have numerous benefits, as opined by Shahid & Rizwan. Electronic devices tend to follow a consistent pattern or guideline as directed by the device being used, so although part of the information captured may still be unstructured, there is still an easy-to-follow structure; hence processing data generated and stored electronically is faster and more accurate, and the approach is widely adopted in developed nations. However, several developing nations still store health records manually, often using paper. Paper-based records face problems such as records going missing, alterations, and difficulty in transporting or sharing [26,45]. Patil, Sunil et al. [51] conducted a survey on the benefits of using electronic devices to store health records, and about three-quarters of those surveyed opined that using electronic devices is important. The early 1990s witnessed the introduction of electronic health records, and though there have been enormous advancements in the field, Evans R [17] opined that a lot still needs to be done, as neither expectations nor the needs of today's rapidly changing world have been met.

Health records cleaning, Integration, Aggregation and mining
Cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a data-set. Integration entails the merging of health records into a centralized system, or the creation of approaches that support the use of distributed systems, facilitating access to information by authorized individuals. Aggregation deals with preparing integrated health records for information processing, and data mining is the process of extracting knowledge or actionable information from raw data. All four elements are essential in the preparation of health records [50]. Health records, irrespective of the storage platform being used, are often heterogeneous: they may be structured, unstructured or semi-structured, and they are geographically distributed.
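The preparation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline of any cited work: the field names (`patient_id`, `facility`, `diagnosis`), the two clinic data sets and the helper functions are all hypothetical, and the final step is reduced to a simple count per diagnosis rather than real data mining.

```python
from collections import defaultdict

# Hypothetical minimal schema; real health records carry many more fields.
REQUIRED_FIELDS = ("patient_id", "facility", "diagnosis")

def integrate(*sources):
    """Integration: merge record lists from several facilities into one list."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged

def clean(records):
    """Cleaning: drop incomplete records, normalise text, remove duplicates."""
    seen, cleaned = set(), []
    for rec in records:
        if any(not rec.get(field) for field in REQUIRED_FIELDS):
            continue  # incomplete record: a required field is missing or empty
        norm = {field: str(rec[field]).strip().lower() for field in REQUIRED_FIELDS}
        key = tuple(norm[field] for field in REQUIRED_FIELDS)
        if key in seen:
            continue  # duplicate once formatting differences are removed
        seen.add(key)
        cleaned.append(norm)
    return cleaned

def aggregate(records):
    """Aggregation: summarise cleaned records as a count per diagnosis."""
    counts = defaultdict(int)
    for rec in records:
        counts[rec["diagnosis"]] += 1
    return dict(counts)

# Made-up example data from two hypothetical facilities.
clinic_a = [
    {"patient_id": "A1", "facility": "Clinic A", "diagnosis": "Malaria"},
    {"patient_id": "A1", "facility": "Clinic A", "diagnosis": " malaria "},  # duplicate
    {"patient_id": "A2", "facility": "Clinic A", "diagnosis": ""},           # incomplete
]
clinic_b = [
    {"patient_id": "B1", "facility": "Clinic B", "diagnosis": "Typhoid"},
    {"patient_id": "B2", "facility": "Clinic B", "diagnosis": "Malaria"},
]

summary = aggregate(clean(integrate(clinic_a, clinic_b)))
print(summary)
```

Here the duplicate and the incomplete record from the first clinic are discarded before counting, which is why cleaning is placed ahead of aggregation in the sketch.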

Health Records Modelling
A health records model is a conceptual representation of the various elements in a health record, the relationships between them and the rules that govern them. Health record modelling is the process of creating such a model. Current health models are developed and evaluated using machine learning techniques. Here the health record data is typically split into a training set and a test set: the training set is used to train a model, optimizing its parameters, while the test set is used after training has been completed to evaluate the model [48]. Challenges in health record modelling can be attributed to the high dimensionality, heterogeneity, lack of fixed standards, sparsity and systematic bias of health records [22,57], but advancements in deep learning and artificial intelligence help in overcoming these challenges [30].
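The training/test split mentioned above can be sketched as follows. This is a minimal standard-library illustration, assuming a random 80/20 split; the function name `split_records`, the split fraction and the placeholder records are assumptions made here and are not prescribed by the cited work.

```python
import random

def split_records(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (training set, test set)."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)       # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records; a real record would hold features and an outcome label.
records = [{"record_id": i} for i in range(10)]
train_set, test_set = split_records(records, test_fraction=0.2, seed=42)

print(len(train_set), "training records;", len(test_set), "test records")
```

Keeping the test set untouched until training has finished is what lets it serve as an estimate of how the model behaves on records it has not seen.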

Health Records Analysis
Health records analysis is the extraction of useful information from health records. It has multiple facets and approaches, often using health record models. Deep learning techniques [30] can be used in health records analysis without human input, resulting in the automatic detection of hidden patterns and relationships between features while creating predictors. Deep learning techniques such as recurrent neural networks (RNNs), autoencoders, deep reinforcement learning, and convolutional neural networks (CNNs) [30,43] are currently being used in health record analysis. For example, CNNSoftmax, an end-to-end deep similarity learning approach for predictive analysis, performs patient representation learning and patient similarity learning using CNNs. Deep learning techniques [30] are popular because they can:
• Analyse multiple data types.
• Detect hidden patterns and relationships in features.

Health Records Interpretation
Health records interpretation is the process through which information extracted from an analyzed health record is presented to an authorized person in an understandable format for the purpose of making an informed conclusion. Conclusions arrived at from the interpretation of health records could have significant implications, so several factors must be considered in order to present health records analysis in a format that allows for easy, quick and accurate grasping of information. It is therefore very important that the results of health records analysis are presented in the best way possible, highlighting correlation, causation, coincidences and biases. One way of ensuring that as much information as possible is presented in a manner that is easy to assimilate is through the use of a good visualization technique [8,33,44].

HEALTH RECORDS MANAGEMENT CHALLENGES
Health records management challenges are challenges that are encountered after health records have been created. They include managing who can access health records, securing the records and maintaining the privacy of patients, among others. The factors to consider are who owns the health record, the security of the health record, the established laws guiding the handling of health records, the protection of patients' privacy, the sharing of patients' records and the cost associated with maintaining health records. These challenges are briefly discussed below.

Ownership
Health records contain valuable information that is also sensitive; for this reason, handlers of such records must ensure that they are kept secure and that privacy is maintained. Disputes may arise as to who owns health records: does the record belong to the patient, the healthcare organization, the healthcare insurer or the government, or is it jointly owned by some or all of these key players? Different countries have assigned ownership differently; hence, before attempting to perform any analysis on health records, one should ensure that all the necessary approvals have been obtained [56].

Security
Securing health records is very important, and so is securing the network, hardware, operating system, web, cloud and any other vulnerable point of attack in healthcare facilities. Data security has evolved from using encryption alone to the adoption of several techniques [12]. Cryptography is used to secure health records against unauthorized access to private information; a popular cryptographic technique is the Hill cipher [38]. A two-factor authentication protocol has also been developed and adopted in the healthcare sector as an efficient and secure access protocol [49]. Blockchain is a decentralized technology that can be used as the foundation of a trustful and immutable system, and its potential has been tested in the handling of health records [32]. Selecting the most effective blockchain model for the health system is, however, where the challenge currently lies, as there are several blockchain-based applications, such as electronic medical record (EMR), MEDREC, MediLedger and Healthcoin [24]. Various other techniques to ensure data security have been developed, such as techniques that support rural healthcare systems [18] and techniques that detect abnormal behaviour [61]. Securing health records has received a lot of focus, and techniques that suit almost all situations have been developed, but the need for further research to mitigate future challenges exists, as new means to breach existing systems are constantly being explored.
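To make the Hill cipher mentioned above concrete, the following Python sketch implements the classical textbook version with a 2x2 key matrix over the 26-letter alphabet. It is not the scheme proposed in [38]: the key, the function names and the restriction to even-length, upper-case A-Z text are assumptions made for illustration. The classical cipher is linear and can be broken with known plaintext, so the sketch shows the mechanics only.

```python
A = ord("A")
MOD = 26  # size of the alphabet A-Z

def _apply(matrix, text):
    """Multiply each pair of letters (as numbers 0-25) by a 2x2 matrix mod 26."""
    if len(text) % 2:
        raise ValueError("text length must be even for a 2x2 key")
    nums = [ord(ch) - A for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x, y = nums[i], nums[i + 1]
        out.append((matrix[0][0] * x + matrix[0][1] * y) % MOD)
        out.append((matrix[1][0] * x + matrix[1][1] * y) % MOD)
    return "".join(chr(n + A) for n in out)

def inverse_key(key):
    """Invert a 2x2 key matrix modulo 26 (requires Python 3.8+ for pow(x, -1, m))."""
    (a, b), (c, d) = key
    det = (a * d - b * c) % MOD
    det_inv = pow(det, -1, MOD)  # raises ValueError if the key is not invertible
    return [[(d * det_inv) % MOD, (-b * det_inv) % MOD],
            [(-c * det_inv) % MOD, (a * det_inv) % MOD]]

def encrypt(plaintext, key):
    return _apply(key, plaintext)

def decrypt(ciphertext, key):
    return _apply(inverse_key(key), ciphertext)

KEY = [[3, 3], [2, 5]]  # illustrative key; determinant 9 is invertible mod 26

cipher = encrypt("HELP", KEY)
print(cipher)                # HIAT
print(decrypt(cipher, KEY))  # HELP
```

Decryption works only when the key matrix's determinant has an inverse modulo 26, which is why `inverse_key` fails for keys whose determinant shares a factor with 26.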

Privacy
Maintaining the privacy of a patient's record is important and often a significant concern to patients [25]. Health records contain sensitive information that patients often want kept private and which, if exposed, could lead to social humiliation, identity theft and economic loss, among others. Several techniques have therefore been developed to protect patients' identities; some of the most commonly used techniques include:
• De-personalization: Removing as many personal identifiers as possible from a patient's medical record [42].
• Anonymization: The use of techniques such as suppression, generalization and slicing [31].
• Pseudonymization: Replacing patient identification data with an identifier.
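The techniques listed above can be sketched with the Python standard library. The example below combines pseudonymization (a keyed hash replaces the patient identifier), suppression (direct identifiers are simply not copied) and generalization (an exact age becomes an age band). The record layout, the secret key and the function names are hypothetical, and this is one possible realisation rather than the method of the cited works.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored apart from the data.
SECRET_KEY = b"replace-with-a-secret-key"

def pseudonymize(patient_id, key=SECRET_KEY):
    """Pseudonymization: replace an identifier with a keyed, repeatable token."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability

def generalize_age(age, band=10):
    """Generalization: report an age band instead of the exact age."""
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

def protect(record):
    """Build a reduced record; name and phone are suppressed by not copying them."""
    return {
        "pseudonym": pseudonymize(record["patient_id"]),
        "age_band": generalize_age(record["age"]),
        "diagnosis": record["diagnosis"],
    }

# Made-up record for illustration only.
record = {
    "patient_id": "NG-000123",
    "name": "Jane Doe",
    "phone": "0800-000-0000",
    "age": 37,
    "diagnosis": "malaria",
}

safe = protect(record)
print(safe)
```

Because the same identifier always maps to the same pseudonym under a given key, records for one patient can still be linked for analysis without revealing who the patient is.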

Governance
Governments and health boards all over the world, including the World Health Organization, have established various standards, guidelines or sets of practices that health workers are expected to follow, especially when making critical decisions, as these decisions could have significant negative effects. Below are some of the more popular health standards:
• General Data Protection Regulation (GDPR) in Europe [1,47].
• Electronic Health Record Architecture and Data Standard in China [60].
• eHealth-law, a "law on safe communication and applications in healthcare" in Germany [26].
• Health Information Technology for Economic and Clinical Health Act (HITECH) [29].
• Patient Protection and Affordable Care Act (PPACA) [29].
Governance also includes monitoring the health records and how they are processed to ensure that desired outcomes are met; this is often done in the form of an audit.

Sharing
The need to share health records across clinics, hospitals, pharmacies, laboratories and researchers, among others, often arises [3]; hence the need to ensure that those authorized to access and handle these records do so in a confidential manner, following strict guidelines or standards. Without established standards or guidelines for sharing health records, health facilities find it difficult to share data because they need to ensure that the privacy of their patients is protected. A study estimated that 78 billion dollars could be saved annually if data exchange standards were utilized across the board [29]. For example, the European Union (EU), understanding the diversities and peculiarities of its member states, issued directives allowing member states, via coordinated actions, to expand the use of interoperable electronic health records under its eHealth Action Plan 2012-2020 [26,46].

Cost/Operational Expenditures
With health records constantly on the increase, storage and processing capacities need to be improved to handle the amount of data generated in an effective manner. Acquiring, setting up and managing storage and processing equipment that can handle the huge amount of data produced in health centers is expensive. Researchers and managers of healthcare facilities are constantly looking for ways to reduce the cost of managing such data centers or to obtain funding from other sources such as state or national governments, non-governmental organizations and private establishments.

DEVELOPING NATIONS
There is no clear definition universally accepted for developing nations nor is there a clear agreement on which countries fit this category. In this study, we decided to classify countries with low Human Development Index (HDI) relative to other countries as developing nations. A questionnaire was developed carefully and given to health care professionals in developing countries to provide information on how generated data is stored, the quality of the data stored, and electronic sharing of patient's data, among others.
Physicians and surgeons formed the major portion of those who responded to the questionnaire, accounting for 79.6% of the responses; 11.8% of the responses came from nurses, nurse assistants and midwives, while other categories such as pharmacists, records managers, dentists, optometrists, sonographers, healthcare administrators and general practitioners accounted for a little above one percent each. To understand the challenges hindering the visualization of developing nations' health records, the challenges involved in processing and managing health records were considered. Solutions to these challenges have been developed to some extent, and researchers are actively seeking ways to further improve these solutions. The questionnaires that were issued to healthcare professionals in developing nations focused only on challenges that involve human input. The challenges hindering the processing and management of health records are discussed below.

Generation and Storage
To better understand how health records are captured and stored, the questions posed to health professionals investigated how they collected the information they recorded in the health records, how these records are stored, and their ability to use electronic devices to capture information.

Figure 3: Storage Method Used
The results showed that manual storage of health records is still dominant, accounting for 45.2% of the total storage, electronic means account for just 10.8% while 44.1% use a combination of both means. An encouraging observation was that 76% of the respondents noted that their health facility is either planning to adopt electronic means to store records or had begun the process. Their responses also clearly showed that most health professionals understand how to use electronic devices to upload information effectively.
Dr Michael Kenechukwu Caled, a medical doctor in Nigeria, stated that "the cost and logistics involved in purchasing and setting up computers and other electronic devices in health facilities is the toughest step in transitioning from manual to electronic means. Should researchers develop tools that would enable health professionals use mobile phones and existing devices to electronically store health records, the adoption of electronic means would greatly improve". Researchers also need to create tools that can capture non-digital results such as X-rays and manually documented reports, among others, so that even though records are generated manually, they can still be transferred to electronic form.

Records cleaning, Integration and Aggregation
To understand the quality of the data and if they were integrated properly certain questions were asked.

Figure 4: Storage Method Used
It was observed that the quality of the health records is questionable, as 88% of these workers have encountered situations where the records were in such a poor state that they could not be used. It was also observed that less than 8% of the health professionals store information in the cloud, and less than 17% of the stored information is shared electronically. This implies that these records need to be cleaned to improve their quality. Integrating and aggregating health records in developing nations is therefore still very much a challenge.
Developing techniques to improve the accuracy of health records may need further research, as the syntax and semantics used by health professionals in developing nations may differ from those used in developed nations. Recently developed natural language models such as BERT and ELMo can be tested to see if they can be used to further improve the accuracy of health records.

Records Modelling
Health records modeling utilized techniques that could easily be transferred, and which have no unique characteristics for identification. Hence, it was not further investigated.

Records Analysis
Similar to health records modelling, health records analysis was not further explored.

Records Interpretation
Though there are several ways to present information obtained from analyzing health records for interpretation, using visualization techniques has proven to be the most effective. Developing nations may have difficulty affording powerful devices that can handle the processing of these visualizations; hence researchers need to proffer solutions that would allow developing nations to use their existing devices and still visualize these health records.
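As one hedged illustration of what a lightweight visualization on existing devices could look like, the sketch below renders the storage-method shares reported from the questionnaire (45.2% manual, 10.8% electronic, 44.1% both) as a plain-text bar chart using only the Python standard library. The function name, the labels and the bar width are assumptions made here; this is not a tool proposed or evaluated in this study.

```python
# Storage-method shares as reported from the questionnaire responses.
storage_share = {"Manual (paper)": 45.2, "Electronic": 10.8, "Both": 44.1}

def render_bars(shares, width=40):
    """Return one text bar per category, scaled so the largest share fills `width`."""
    largest = max(shares.values())
    lines = []
    for label, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(pct / largest * width)
        lines.append(f"{label:<16}{bar} {pct:.1f}%")
    return lines

for line in render_bars(storage_share):
    print(line)
```

Text output of this kind needs no graphics library or powerful hardware, so it can be displayed on a basic terminal or sent as a text message, at the cost of the interactivity that richer tools provide.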

Management Challenges
There are several management challenges hindering health records visualization in developing nations. Though these challenges are not unique to developing nations, developed economies have been taking steps to address them, and developing nations need to follow suit. The challenges we suggest developing nations need to address for adequate visualization of health records are:
• Ownership: Healthcare organizations need to be permitted to analyze patients' health records, even without their consent, in a manner that protects patients' privacy. To achieve this, existing standards need to be modified to encourage research while ensuring that patients' privacy is protected.
• Sharing: With little or no sharing of health records via electronic means among developing nations, the use of already developed techniques that allow for easy sharing of information needs to be encouraged.
• Cost/Operational Expenditure: Just as in developed nations, health centers should be supported financially to transition from manual storage to electronic methods.
• Governance: The number of health professionals that do not strictly follow the guidelines, as seen in figure 4, is too high. Authorities need to find ways of encouraging them to follow established standards and guidelines.

CONCLUSION
The process of extracting knowledge from data has been transformed in the last decade as a result of successes in machine learning algorithms and artificial intelligence in almost all sectors, including the healthcare sector. Visualization techniques have been used to present the information in a manner that allows for efficient extraction of knowledge or actionable information. In this survey paper, we gave a brief overview of health records, their characteristics and both processing and management challenges. We then showed that technological advancements have not improved the way knowledge is extracted in developing nations, and made suggestions to remedy the situation, as the current global pandemic has shown that a health crisis in one nation could impact other nations.