Methodological Challenges and Lessons Learned from Assessing the Routine Health Information Management Data Quality: Experience from Tanzania

Background: Globally, there has been an increase in the demand for quality data for evidence-based planning. One source of such data is the Routine Health Information System (RHIS), which collects information regularly, on an agreed-upon schedule, using standardized tools [1]. Despite being criticized for inaccuracy, RHIS data have the potential to inform the planning of health interventions, beyond providing data on service utilization [1-5]. Institutions have pioneered tools and frameworks for assessing data quality and improving the performance of the RHIS [1,6,7]. However, most of these tools examine overall system performance rather than source documents such as registers [1].


Commentary
Globally, there has been an increase in the demand for quality data for evidence-based planning. One source of such data is the Routine Health Information System (RHIS), which collects information regularly, on an agreed-upon schedule, using standardized tools [1]. Despite being criticized for inaccuracy, RHIS data have the potential to inform the planning of health interventions, beyond providing data on service utilization [1-5]. Institutions have pioneered tools and frameworks for assessing data quality and improving the performance of the RHIS [1,6,7]. However, most of these tools examine overall system performance rather than source documents such as registers [1].
The Tanzania RHIS context: Tanzania uses paper-based registers to collect, process, analyse and transmit patients' routine health information, from the health facilities up to the district. The RHIS comprises over ten sets of registers and 70 processing and reporting forms [8,9]. Each clinician or nurse records client consultations in the registers and simultaneously counts each diagnosis using tally sheets. At the end of every month, the data are summarized in secondary data forms for reporting to the health management teams at the district, the region and the ministry of health. At the district level, the DHIS2 focal person uploads the monthly reports into open-source software.
We embarked on an assessment of the quality of data collected using the individual health records contained in the health facilities' paper-based registers by examining four determinants of data quality [5]. The determinants included the completeness of data elements, the timeliness of report submission, and the accuracy of the collected data [1,5,7,10]. This commentary re-examines some of the methodological challenges of conducting the data quality assessment and the lessons learned during the design and execution of the study in a resource-poor setting.

Methods
The study design was a retrospective case series analysing the outpatient registers for children aged below five years, also referred to as under-five outpatient registers. The authors included all registers completed from October 2013 to March 2014 in 24% (10/42) of the health facilities of Ilemela municipality, Mwanza region. The authors developed the study design under the assumptions that the registers might have missing individual records, that health workers might not submit the monthly reports on time, and that there might be discrepancies between the actual register counts of the diseases of interest and the monthly reports submitted to the health management team. The detailed methods, which form the basis for this commentary, are available in Kabakama, 2016 [5].
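To make the first two assumptions concrete, the following is a minimal sketch of how completeness of data elements and timeliness of report submission might be computed once register entries have been transcribed. The field names, the example records and the reporting deadlines are hypothetical; the commentary does not list the register columns or the submission deadline used in the study.

```python
from datetime import date

# Hypothetical data elements; the real register columns are not given in the text.
REQUIRED_FIELDS = ("age", "sex", "diagnosis", "treatment")


def is_filled(value):
    """A data element counts as filled if it is not None and not blank."""
    return value is not None and str(value).strip() != ""


def completeness(records, required=REQUIRED_FIELDS):
    """Share of required data elements that are filled in across all records."""
    expected = len(records) * len(required)
    if expected == 0:
        return 0.0
    filled = sum(
        1 for record in records for field in required if is_filled(record.get(field))
    )
    return filled / expected


def timeliness(submission_dates, deadlines):
    """Share of monthly reports submitted on or before their deadline."""
    if not deadlines:
        return 0.0
    on_time = sum(1 for sent, due in zip(submission_dates, deadlines) if sent <= due)
    return on_time / len(deadlines)


# Illustrative records only, not study data.
records = [
    {"age": 2, "sex": "F", "diagnosis": "malaria", "treatment": "given"},
    {"age": 4, "sex": "", "diagnosis": "pneumonia", "treatment": None},
]
sent = [date(2013, 11, 5), date(2013, 12, 12)]
due = [date(2013, 11, 7), date(2013, 12, 7)]  # assumed deadlines

print(f"completeness: {completeness(records):.0%}")
print(f"timeliness: {timeliness(sent, due):.0%}")
```

An age of 0 is treated as a filled value, which matters for registers of children below five years.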

Results and Discussion
The main challenge of the study method is that it is time-consuming and costly, mainly because of the transcription and decoding of patients' records into electronic software for analysis and interpretation of the quality determinants. Future studies intending to use this method could minimize costs and improve the representativeness of the results by reducing the number of months and registers involved while increasing the number of health facilities and districts participating in the study.

Health facilities inclusion criteria
The first step was to exclude 15/42 (36%) of the health facilities, which had not reported consistently over the past six months. The second step was to ensure that at least 20% of the total facilities in the municipality were included in the study. Including only the consistently reporting health facilities could have introduced a design bias. Arguably, because the purpose was to assess the accuracy and timeliness of reporting, the study design was appropriate. The shortcoming of the approach is that we might have excluded the poorly performing health facilities from the assessment. Besides, there is not enough experience on what proportion of health service outlets should be included in a data quality assessment.
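The inclusion arithmetic can be checked with a short sketch. The counts (42 facilities, 15 excluded, 10 included) and the 20% threshold come from the text; the variable names and the derived number of eligible facilities are illustrative.

```python
total_facilities = 42       # all health facilities in the municipality
excluded_inconsistent = 15  # not consistently reporting over the past six months
included = 10               # facilities finally included in the study
minimum_coverage = 0.20     # at least 20% of all facilities

eligible = total_facilities - excluded_inconsistent
excluded_share = excluded_inconsistent / total_facilities
coverage = included / total_facilities
meets_threshold = coverage >= minimum_coverage

print(f"eligible facilities: {eligible}")
print(f"excluded share: {excluded_share:.0%}")
print(f"coverage of all facilities: {coverage:.0%}")
print(f"meets the 20% threshold: {meets_threshold}")
```

Because the 10 facilities were drawn only from the consistently reporting ones, the coverage figure describes that subgroup rather than all 42 facilities, which is the design bias discussed above.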

Register's records transcription and decoding
Transcription and decoding was the most difficult part of the study. The data collection process entailed going through the completed under-five outpatient records for six consecutive months. The facility with the highest number of consultations had 4-5 completed and archived registers in one month. Although data quality begins with the accurate completion of source documents such as patients' records, going back to the registers is not only a cumbersome process but also a time-consuming and costly exercise [1,7,11]. Another challenge was the amount of time spent decoding the diagnoses into 7 categories of interest to enable comparison between the counts in the registers and the monthly reports. This process was necessary because clinicians classified diseases based on a number of guidelines, mostly syndromic. After the initial transcription of the records, the analysis was just a tabulation of missing data and a comparison of the register counts with the monthly reports. The simplified version of data quality assessment counts, from the summary tables, the proportion of reporting health facilities and the total reports submitted against those expected [12-16]. However, these methods leave out the recording process, which is an essential step toward quality assessment.
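The decoding step described above could be sketched as a keyword lookup followed by a tally per category. The keywords and categories below are hypothetical placeholders chosen for illustration; they are neither the seven categories used in the study nor clinical guidance, and in practice the mapping would need review by someone with a clinical background.

```python
from collections import Counter

# Hypothetical keyword-to-category map (NOT the study's seven categories).
CATEGORY_KEYWORDS = {
    "malaria": "malaria",
    "pneumonia": "pneumonia",
    "fast breathing": "pneumonia",
    "diarrhoea": "diarrhoea",
    "loose stool": "diarrhoea",
}


def decode(diagnosis):
    """Map a free-text register diagnosis to a category, or 'other' if unmatched."""
    text = diagnosis.strip().lower()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in text:
            return category
    return "other"


def tally(diagnoses):
    """Count transcribed diagnoses per decoded category."""
    return Counter(decode(d) for d in diagnoses)


# Illustrative register entries only, not study data.
entries = ["Malaria", "Severe malaria ", "Loose stool", "fast breathing", "Otitis media"]
register_counts = tally(entries)
print(dict(register_counts))
```

Entries that fall into "other" are the ones most likely to need manual decoding, which is where much of the time reported above was spent.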
Accuracy in this study was determined as the variance between the reports and the source documents. The authors did not ascertain whether the registers measured what they were intended to measure [7,10]. The shortcoming of the case series method is that we are unable to understand whether an observed discrepancy resulted from intentional falsification or was unintentional, owing to a lack of knowledge or skills. The study method involved little interaction between the health providers and the researchers. When measuring the accuracy of a record, it would be ideal to have two observers interview the same patient, fill in the patient's log and compare the findings. With this technique of measuring accuracy, the study design would no longer be a retrospective case series. It would, however, be equally expensive, as it requires a second observer as a gold standard.
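The comparison between source documents and reports can be illustrated with a minimal sketch that computes, per disease category, the difference between the monthly report and the recounted register. The function name, the categories and the counts are hypothetical, and the percentage measure is one reasonable choice rather than the exact statistic used in the study.

```python
def discrepancies(register_counts, report_counts):
    """Per category: (reported minus recounted, percent of the recounted value).

    The percentage is None when the register recount is zero.
    """
    result = {}
    for category in sorted(set(register_counts) | set(report_counts)):
        recounted = register_counts.get(category, 0)
        reported = report_counts.get(category, 0)
        difference = reported - recounted
        percent = 100 * difference / recounted if recounted else None
        result[category] = (difference, percent)
    return result


# Illustrative counts only, not study data.
register = {"malaria": 40, "pneumonia": 25}
report = {"malaria": 44, "pneumonia": 20, "diarrhoea": 3}

table = discrepancies(register, report)
for category, (difference, percent) in table.items():
    print(category, difference, percent)
```

A positive difference indicates over-reporting relative to the register and a negative one under-reporting. As noted above, such a table cannot say whether a discrepancy was intentional.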

Lessons learned
The lesson learned on study design is that the reporting period could be reduced from six months to one or two months, allowing more municipalities (districts) and more health facilities to be included in the study. This sampling approach could also improve the representativeness of the study results. Data quality assurance (DQA) techniques have also suggested this wider spread and the sampling of some records rather than complete enumeration of the registers [7]. Sampling within registers could facilitate the review of piles of patients' logs. The scope of a data quality assessment should take into account time, human and financial resources.
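Sampling within registers could look like the following sketch, which draws a simple random sample of record numbers instead of enumerating every entry. The 10% fraction, the seed and the function name are assumptions for illustration; the text does not specify a sampling fraction or scheme.

```python
import random


def sample_records(record_ids, fraction, seed=0):
    """Draw a reproducible simple random sample of record identifiers."""
    if not record_ids:
        return []
    size = max(1, round(len(record_ids) * fraction))
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return sorted(rng.sample(record_ids, size))


# Illustrative register with 200 numbered entries.
register_entries = list(range(1, 201))
sampled = sample_records(register_entries, fraction=0.10)
print(len(sampled), "of", len(register_entries), "entries selected for review")
```

Fixing the seed lets a second reviewer re-draw the same sample, which helps when transcription has to be verified.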
A further lesson concerns the classification of diseases. If the nomenclatures contained in the numerous guidelines are not harmonized, it will be difficult for studies of a similar design to compare between diseases and among health facilities. Researchers may need to use research assistants with a clinical background and prior training in the methods used to classify diseases in that context. If the assessment is for administrative purposes, it would be important for clinicians to adhere to the recommended ICD-10 classification, as contained in the MTUHA guideline. Harmonization of classification would be the first step towards assessing RHIS data quality performance [9]. Other studies have shown that where the classification of diseases is harmonized, the quality of morbidity data reporting improves [12,13].
In resource-poor settings there are few options for dissemination. The authors presented the study results at only one university symposium, although the time from manuscript submission to publication was relatively short. Opportunities to disseminate and use research findings in resource-limited settings are limited and usually underfunded.

Conclusion
This commentary presents the methodological challenges and lessons of assessing RHIS data quality using individual records and registers in resource-limited settings. The authors present issues one is likely to encounter and suggest possible solutions. The commentary may complement existing RHIS assessment tools and contribute to the existing knowledge in the literature, both published and grey.