Development of comparable algorithms to measure primary care indicators using administrative health data across three Canadian provinces

Abstract
Introduction: Performance measurement has been recognized as key to transforming primary care (PC). Yet, performance reporting in PC lags behind even though high-performing PC is foundational to an effective and efficient health care system.
Objectives: We used administrative data from three Canadian provinces, British Columbia, Ontario and Nova Scotia, to: 1) identify and develop a core set of PC performance indicators using administrative data and 2) examine their ability to capture PC performance.
Methods: Administrative data sources included physician billings, the Discharge Abstract Database, the National Ambulatory Care Reporting System database, the Census, and Vital Statistics. Indicators were compiled based on a literature review of PC indicators previously developed with administrative data available in Canada (n=158). We engaged in iterative discussions to assess data conformity, completeness, and plausibility of results in all jurisdictions. Challenges to creating comparable algorithms were examined through content analysis and research team discussions, which included clinicians, analysts, and health services researchers familiar with PC.
Results: Our final list included 21 PC performance indicators pertaining to 1) technical care (n=4), 2) continuity of care (n=6), and 3) health services utilization (n=11). Establishing comparable algorithms across provinces was possible though time intensive. A major challenge was inconsistent data elements. Ease of data access and a deep understanding of the data and practice context were essential for selecting the most appropriate data elements.
Conclusions: This project is unique in creating algorithms to measure PC performance across provinces. It was essential to balance the internal validity of the indicators within a province against the external validity of the algorithms across provinces. Using exactly the same coding across provinces was infeasible due to the lack of standardized PC data; rather, a context-tailored definition was developed for each jurisdiction. This work serves as an example for developing comparable PC performance indicators across provincial/territorial jurisdictions.


Introduction
Reporting on performance can influence quality improvement agendas and improve performance [1]. Past work shows that public reporting may improve performance, [2][3][4][5][6] as it has the potential to "improve the quality of care, increase accountability, facilitate public participation in health care," [7][8][9] impact societal and professional values, and direct attention to issues not currently on the policy agenda [7,10,11]. It may also facilitate collaboration among stakeholders as they set a common agenda [12]. Performance reporting in the hospital sector continues to grow. Yet, performance reporting in primary care (PC) lags behind even though high-performing PC is widely recognized as foundational to an effective and efficient health care system. Countries with strong PC systems have lower mortality rates and overall costs as well as better health outcomes and health equity [13][14][15].
There are growing demands for performance reporting in PC from many stakeholders including patients [16,17]. Performance measurement has been recognized as key to transforming PC [18][19][20]. This includes comparison of performance against both internal and external standards, and identifying opportunities for improvement [21]. However, PC performance reporting is challenging because of the dearth of concise and synthesized information, and because many clinicians prefer to be accountable only for their individual role and do not view themselves as actors within a larger system [22].
There are examples of national public reporting of PC performance in other countries but only limited efforts in Canada. International examples include BEACH in Australia [5,23], the National Ambulatory Medical Care Survey (NAMCS) in the USA [18] and the Quality and Outcomes Framework in the UK [24,25]. There has also been some provincial PC reporting by Health Quality Councils [26,27]. The only significant national effort in Canada was the joint Canadian Institute for Health Information (CIHI)/Health Council report of a 2008 population survey [19]. The most commonly referenced performance information about PC in Canada comes from the Commonwealth Fund patient and clinician surveys in industrialized nations [2,[28][29][30][31][32]. These independent surveys are based on samples of 1000 patients or clinicians per country and show that PC performance in Canada is poor compared to other countries. These disappointing results have helped put Community-Based Primary Health Care on Canada's policy radar. Yet, the Commonwealth Fund surveys have limitations. Notably, the small sample size does not permit meaningful analysis at the regional level where policy decisions are often made.
The creation of comparable information from PC data across jurisdictions remains nascent. Comparable information within and across jurisdictions is needed to ultimately influence health and healthcare outcomes. Moreover, this task is relevant for all learning healthcare systems, [33] particularly in federated jurisdictions like Canada, where there are multiple parallel provincial and territorial single-payer systems [34]. Because the fundamentals and objectives of PC are universal, [35] the transfer of learning experiences from one jurisdiction to another should be facilitated, especially within one country. This undertaking includes creating and maintaining high quality data as well as work towards gaining meaning from existing healthcare data within integrated healthcare systems.
Measuring PC requires multiple sources of information, including data from patients, clinicians, charts and administrative data [36]. While comprehensive PC information and measurement systems are still being built, one obvious place to start is using health administrative data since it is already routinely collected. These data are longitudinal and can provide actionable information about healthcare services [37][38][39]. For example, Health Quality Ontario uses administrative data to provide PC clinicians with reports on performance of care and health service utilization particular to their practices' patient panels [40]. The purpose of this work was to 1) identify and develop a core set of indicators of PC performance using administrative health data and 2) examine their ability to capture PC performance. This paper reports on the processes of doing this work, including the infrastructure and resources needed to support it, noting challenges and examples of how to promote similarity when working with administrative data from multiple jurisdictions. While this work relied only on Canadian data from three separately funded and managed provincial health systems, it holds lessons for other efforts to compare data across health systems.

Methods
This work took place as part of the TRANSFORMATION study, which set out to improve the science and reporting of PC performance. TRANSFORMATION is a cross-sectional, multi-site research program [41] in Canada that used multiple sources of data (patient, provider and organizational surveys, administrative data, case studies, and deliberative dialogues with patients and clinicians) to produce comprehensive regional-level PC performance reports. The study sites were Fraser East, British Columbia (BC); the Eastern Ontario Health Unit, Ontario (ON); and Central Zone, Nova Scotia (NS), chosen based on the willingness of the clinicians and decision-makers in these areas to participate. Herein, we describe our methodology for developing PC reporting using the administrative data.
Administrative data sources. Canada's 13 provincial/territorial (P/T) governments are responsible for the delivery and organization of healthcare: each government is the insurer and administers its own version of a healthcare plan [21]. Healthcare is paid for through federal government transfers and P/T public tax revenues [42]. Most PC is delivered by family physicians who work in private practices, which are essentially small businesses [43]. These businesses generate data because they bill for services, and the single largest source of PC administrative data is provincial billing data. In addition, specific provincial data sources relating to health coverage registration and services delivery were accessed and used for this study (Table 1).
Several administrative datasets were common across all three provinces (Table 1), including the Discharge Abstract Database (DAD) and, to a lesser degree, the National Ambulatory Care Reporting System (NACRS) database [44,45]. While DAD coverage and data elements were consistent across all provinces, differences in NACRS were present. In ON, detailed data pertaining to emergency department visits and other outpatient services (such as day surgeries) were collected from all facilities. In the other two provinces, emergency department visits were captured in NACRS for only some facilities; these data were combined with physician billings to capture as many emergency department visits as possible. Further, while day surgery was available in NACRS for NS, it was only available in DAD for BC.
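As a simplified illustration of how emergency department visits recorded in NACRS and those inferred from physician billings might be pooled without double counting, the sketch below combines the two sources and keeps at most one visit per patient per day. The dataset layouts, column names, and de-duplication rule are assumptions made for illustration, not the study's actual specification.

```python
import pandas as pd

# Hypothetical extracts: NACRS emergency department (ED) records and ED-related
# physician billings. Column names and values are illustrative only.
nacrs_ed = pd.DataFrame({
    "patient_id": ["A1", "A1", "B2"],
    "visit_date": pd.to_datetime(["2016-03-01", "2016-05-10", "2016-03-01"]),
})
billing_ed = pd.DataFrame({
    "patient_id": ["A1", "B2", "C3"],
    "visit_date": pd.to_datetime(["2016-03-01", "2016-04-02", "2016-06-15"]),
})

# Pool both sources, then keep at most one ED visit per patient per calendar day
# so that visits captured in both NACRS and billings are not double counted.
combined = pd.concat([nacrs_ed.assign(source="NACRS"),
                      billing_ed.assign(source="billing")])
ed_visits = combined.drop_duplicates(subset=["patient_id", "visit_date"])

print(ed_visits.groupby("patient_id").size())  # ED visits per patient
```

In practice, each province would substitute its own facility coverage rules and billing fee codes while keeping the pooling logic the same.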
All jurisdictional data can be linked within a province using a unique identifier; the linkage of datasets is completed by the data-hosting agency, and the research team accesses a de-identified linked dataset. However, these linked jurisdictional data can only be analyzed within each province. Population Data BC [51], ICES (formerly the Institute for Clinical Evaluative Sciences) in ON [52], and Health Data NS [53] provide their province's secure research environment. Although these organizations have similar mandates to protect patient confidentiality, significant differences in experience and available resources are observed among them [51]. Specifically, ICES has considerable funding and capacity for work with administrative databases that is not available to the same extent in BC or NS. These differences led to additional complexities in our ability to analyze the data.
PC performance indicators. Indicators of PC performance were compiled based on a literature review of PC indicators previously developed with administrative data available in Canada (n=158), [1,[3][4][5][6][7]9,10,52] including a review of previously developed CIHI indicators [53][54][55][56][57][58][59][60][61][62] (see supplemental material: S_Table 1). For each indicator, we ascertained whether the requisite administrative data were available and whether a pre-existing algorithm was being used to measure a similar construct, or whether the indicator needed to be developed specifically for this study. The research team shortened the list using the following inclusion criteria: 1) the indicator measures PC performance, and 2) the data used to construct the indicator were available in at least two of the three provinces; this threshold was reduced from three because few indicators could be computed in all three provinces. To the extent possible, we constructed the cross-provincial list of indicators from each province's data elements so that they would produce comparable algorithms to measure PC performance.
Mapping PC performance indicators to a theoretical framework. To ensure that the identified indicators target key theoretical aspects of PC performance and to assess which of these theoretical domains can be addressed with administrative data [63], indicators were mapped to the Hogg et al. (2008) framework [37]. This PC Performance Measurement Framework, which includes core PC performance domains, was chosen because it provides a comprehensive view of PC performance [36,37,64,65]. Indicators fitting into this framework demonstrate a high potential to help improve patient care.
Analysis to assess algorithm comparability of PC indicators. Knowlton et al.'s (2017) framework for aligning data from multiple sources was used to assess each province's algorithms for creating the performance indicators [66]. Specifications of each candidate PC indicator were defined. We engaged in iterative discussions to assess data conformity, completeness, and plausibility of the results pertaining to PC performance indicators in all jurisdictions. We maintained a file of detailed documentation and a spreadsheet of the decisions, a process known as 'data curation' [67].
The documentation was used to record the process of establishing the comparable algorithms for PC performance. It included definitions of what was recorded in the datasets; inclusion and exclusion criteria, such as age and sex restrictions used to create a cohort for each indicator; the name(s) and type(s) of the variables used in the denominator and numerator of performance indicator calculations; and any additional information needed to understand each indicator. Notes and comments also documented whether indicators were routinely computed for ongoing performance measurement, such as the indicators in the ICES Primary Care Population cohort (PCPOP) that are used for Ontario's performance reports to family physicians, or whether indicators required new or adapted algorithms.
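As an illustration of what this kind of 'data curation' documentation can look like in machine-readable form, the sketch below records one indicator's specification as a small data structure. The field names, datasets, and code placeholders are hypothetical simplifications for illustration, not the study's actual template.

```python
# Illustrative only: one entry from a hypothetical indicator-documentation file
# used to align algorithms across provinces. Fee codes are left as placeholders.
indicator_spec = {
    "name": "Osteoporosis screening",
    "cohort": {"sex": "F", "age_min": 65, "age_max": 75},  # illustrative inclusion criteria
    "denominator": "eligible people registered for coverage in the observation year",
    "numerator": "those with a bone mineral density test billing in the period",
    "data_elements": {  # province-specific sources and codes
        "BC": {"dataset": "physician billings", "codes": ["<BC fee codes>"]},
        "ON": {"dataset": "physician billings", "codes": ["<ON fee codes, aggregated>"]},
        "NS": {"dataset": "physician billings", "codes": ["<NS fee codes>"]},
    },
    "pre_existing_algorithm": {"BC": False, "ON": True, "NS": False},
    "notes": "ON aggregates baseline/follow-up and site-specific codes for comparability.",
}
```

A shared, editable record of this kind lets analysts in each province see exactly which data elements and restrictions their counterparts are using.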
Comparable algorithms were assessed by examining the data source and data elements used to construct the indicator in each province, any modifications made to the original measurement, the ease with which each province could adapt its version to achieve a final indicator, and the practice context within each province. Challenges in creating comparable algorithms were examined through content analysis of the notes and through discussions with the research team, which included clinicians, analysts, and health services researchers familiar with PC.

Results
PC performance indicators. We initially assessed a total of 168 potential indicators, 158 of which were identified through the literature review (see supplementary material). After removing duplicates (n=69), an additional 67 indicators were removed because they did not meet the inclusion criteria. A further seven indicators were removed either because constructing a suitable version would require excessive time or because a comparable definition could not be operationalized across the data sources available in the provinces (Figure 1). For example, colon cancer screening occurs differently across the three provinces: in BC it is performed within PC; in NS it is a government-run service outside of PC; and in ON it is a government-organized program in which PC is incentivised and encouraged to perform screening, with results monitored centrally. Accordingly, colon cancer screening would not be an appropriate PC performance indicator in NS and ON, as it is not performed by PC in NS and is organized by the government in ON.
A total of 21 indicators were included in our study (Table 2). These indicators can be categorized as: 1) technical care (n=4: cervical cancer and osteoporosis screening, diabetes management, and use of metformin); 2) continuity of care (n=6: continuity of care index, usual provider of care index, modified continuity index, continuity with family physicians, mental health continuity, and multiple conditions continuity); and 3) health services utilization (n=11: emergency department visits, hospital readmissions, serious diabetes complications, mental health readmissions, home visits in end-of-life care, ambulatory care sensitive admissions, primary care costs per patient, total costs per patient, physicians prescribing medications per patient, number of different medications, and total medication costs per patient aged 65 and older), as per the Hogg et al. framework [68].
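For concreteness, continuity indices of this kind are simple functions of visit counts per provider. The sketch below computes an illustrative usual-provider-of-care style index from a list of one patient's PC visits; it uses hypothetical data and is a generic example rather than the study's exact specification.

```python
from collections import Counter

def usual_provider_of_care(provider_per_visit):
    """Share of a patient's PC visits made to their most-seen provider (0 to 1).

    `provider_per_visit` holds one provider identifier per visit. Real continuity
    indicators typically also impose minimum-visit requirements and use
    province-specific definitions of a PC visit, which are omitted here.
    """
    if not provider_per_visit:
        return None
    visit_counts = Counter(provider_per_visit)
    return max(visit_counts.values()) / len(provider_per_visit)

# Example: 6 visits, 4 to the same family physician -> index of about 0.67
print(round(usual_provider_of_care(["gp_x", "gp_x", "gp_y", "gp_x", "gp_z", "gp_x"]), 2))
```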
Comparable algorithms of PC performance. Eighteen of the 21 indicators could be constructed in all three provinces (Table 2). Comparable algorithms were easiest to achieve when datasets common to all provinces were used (i.e., those prepared according to a common set of standards set by CIHI). The analysts in each jurisdiction worked to render the data into formats that were as comparable as possible.
[Figure 1. Indicator selection: indicators assessed for potential to obtain comparable definitions (n = 28); comparable definitions assessed to be infeasible (n = 7); indicators included in the analysis (n = 21).]

Table 2 (excerpt). Definitions of selected health services utilization indicators and the provinces in which each could be constructed:
- Ambulatory care sensitive admissions (BC and NS): number of non-elective hospital admissions for each of asthma, COPD, diabetes, and CHF, among people diagnosed with those conditions; rates per 1,000 people in each cohort.
- Primary care costs per patient (all three provinces): mean ambulatory primary care costs per capita over a 1-year observation period.
- Total costs per patient (all three provinces): mean total healthcare costs (excluding prescription medication) per capita over a 1-year observation period, and mean total healthcare costs (including prescription medication) per patient aged 65+ over a 1-year observation period.
- Physicians prescribing per patient (all three provinces): mean number of unique GPs prescribing medications per capita aged 65+ over a 1-year period.
- Number of different medications (all three provinces): mean number of different medications per capita aged 65+ (from all prescribers and, separately, from GP prescribers only), measured at the ATC 4th level (chemical/therapeutic/pharmacological subgroup).
- Total medication costs per capita aged 65+ (all three provinces): mean (and median) total government prescription medication costs per capita aged 65+ (from all prescribers and, separately, from GP prescribers only) over a 1-year observation period.
Challenges in establishing comparable algorithms. A major challenge in establishing comparable algorithms was the significant differences in data elements across provinces. Ease of access to the data, in addition to a deep understanding of the data and practice context, was essential for deciding upon the most appropriate data elements within the data sources. Ontario's critical mass of analytic and clinical capacity, developed over a number of years, compared with the "younger" capabilities of BC and NS to use these data, added to the complexity of cross-jurisdictional work in PC. These single-center structures have heterogeneous data access request requirements, varying funding arrangements for internal analytic capacity, and mission statements that shape their ability to create and refine a library of data programs. For example, ON researchers can conduct multiple analyses of a single dataset, provided privacy and organizational approvals are obtained and dataset requirements are followed for each analysis, whereas researchers in BC and NS can only conduct analyses that have already been approved and must submit an amendment or a new data access request for additional analyses. Moreover, sustained provincial government funding and the building of analytic capacity within ICES have meant that ON has dedicated significantly more time and resources to fine-tuning its specialized databases (e.g. the Ontario Diabetes Database) and algorithms (e.g. the series of codes used to identify cancer screening indicators) to increase the precision of indicator estimates.
It is current practice for ICES to publicly report on PC performance indicators, whereas this investment in healthcare system reporting is not currently seen in BC or NS. For some of our indicators, ON already had an algorithm, and the team discussed the merits of modifying or adapting it for BC and NS. For example, the denominator for the 'Diabetes Management' indicator should include all people meeting the overall study criteria who have diabetes. Ontario was the only province with a pre-existing Diabetes Database containing information on people who fulfill a validated algorithm for diabetes diagnosis. This algorithm identifies diabetes cases based on physician claims, including both diagnostic and service codes, and hospital admissions over a two-year period [69,70]. This pre-existing validated algorithm provides ON with a more accurate denominator. For many indicators, there was much discussion amongst the study team about how much precision was needed within the algorithm, whether ON's previously developed algorithm would be comparable with measures derived from the other provinces, and whether a modified version was needed to achieve comparability with BC and NS. In this case, BC already had a similar pre-existing definition for identifying patients with diabetes but no separate database; Nova Scotia also had these data available but likewise did not have a separate database. Therefore, the same denominator identification algorithm was adopted by all three provinces, restricted to the years of observation available in all three.
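To make the shape of such a claims-based case definition concrete, the following is a minimal sketch, not the validated Ontario algorithm itself: it flags a person as a diabetes case if they have either one hospitalization or two physician claims carrying a diabetes diagnosis within a two-year window. The table layouts, column names, and threshold rule are illustrative assumptions.

```python
import pandas as pd

# Hypothetical extracts; real analyses use province-specific datasets and
# validated diagnosis code lists rather than a simple boolean flag.
claims = pd.DataFrame({
    "patient_id": ["A1", "A1", "B2", "C3"],
    "service_date": pd.to_datetime(["2015-02-01", "2016-01-15", "2015-06-01", "2016-09-09"]),
    "diabetes_dx": [True, True, True, False],
})
hospitalizations = pd.DataFrame({
    "patient_id": ["B2"],
    "admit_date": pd.to_datetime(["2015-07-20"]),
    "diabetes_dx": [True],
})

def has_diabetes(pid, window_days=730):
    """Illustrative rule: one diabetes hospitalization, or two diabetes
    physician claims within `window_days`, marks a prevalent case."""
    if ((hospitalizations["patient_id"] == pid) & hospitalizations["diabetes_dx"]).any():
        return True
    dates = claims.loc[(claims["patient_id"] == pid) & claims["diabetes_dx"],
                       "service_date"].sort_values()
    # If any two claims fall within the window, some consecutive pair does too.
    return bool((dates.diff().dt.days <= window_days).any())

print({pid: has_diabetes(pid) for pid in ["A1", "B2", "C3"]})  # {'A1': True, 'B2': True, 'C3': False}
```

In practice, each province would apply its own diagnosis code lists and data sources while keeping the case-definition logic aligned.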
While the above represents a case where all provinces could agree upon and arrive at the same definition, in other cases discussion revealed that some pre-existing algorithms were meaningful for ON's goals but conceptually different from what would be a meaningful indicator of performance in the other provinces. For example, for indicators pertaining to continuity and usual provider of care, ON would prefer to use only data for patients rostered to a provider, but BC and NS do not roster patients; cross-provincial comparisons would be skewed if ON used only rostered patients. In other words, if each province conceptualized an ideal performance indicator regardless of data availability, there may be discrepancies. Comparability in this situation is more about values and context than about data quality or completeness.
Differences in data granularity, that is, the level of detail available in a dataset, were also a challenge in establishing comparable algorithms. We needed to accept decreased precision, reducing the level of detail used from one province to gain comparability with the level of detail available in the other provinces. For example, to define the indicator relating to osteoporosis screening, we used physician billings to identify bone mineral density testing. While all provinces could use fee codes, ON had more options available. To make a comparable indicator, ON aggregated several codes (those indicating baseline and follow-up testing and those specifying the body site being tested) because fewer billing codes were available in the other provinces [71]. Similar procedures were performed on other indicator algorithms across the three provinces.
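As a concrete illustration of this kind of aggregation, the sketch below maps several granular fee codes to a single harmonized 'bone mineral density test' concept so that each province's billing data yields the same yes/no screening flag. All code values are placeholders, not actual provincial fee schedule codes.

```python
# Placeholder codes only: real fee schedule codes differ by province and were
# selected by analysts familiar with each billing system.
BMD_TEST_CODES = {
    "ON": {"ON_BMD_BASELINE", "ON_BMD_FOLLOWUP", "ON_BMD_HIP", "ON_BMD_SPINE"},
    "BC": {"BC_BMD"},
    "NS": {"NS_BMD"},
}

def had_bmd_test(province, billed_codes):
    """True if any billed fee code maps to the harmonized BMD-test concept."""
    return bool(BMD_TEST_CODES[province] & set(billed_codes))

# ON's more granular codes collapse to the same yes/no flag used in BC and NS.
print(had_bmd_test("ON", ["ON_BMD_HIP", "UNRELATED_CODE"]))  # True
print(had_bmd_test("NS", ["UNRELATED_CODE"]))                # False
```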

Discussion
The number of indicators that could be developed using administrative data to understand PC performance across Canada is limited to the domains of technical quality of care, continuity of care, and health service utilization. While other PC indicators may be available within a single province, they are not useful for examining PC across Canada. Developing algorithms for performance indicators using administrative data across jurisdictions remains time intensive and has been completed by few groups in Canada; such work has been done in the areas of cancer care [72,73] and palliative care [74,75]. This is the first project to create algorithms measuring PC indicators across BC, ON, and NS using administrative data. Differences in resources for working with administrative data were most profound between ON and the other two provinces (NS and BC).
Technology and a 'living document' that analysts and staff could maintain were key success factors for this work. Platforms that enable research team members to continuously edit the information and add or remove details to facilitate discussion are essential. Our work showed that comparable algorithms and similar indicators across provinces using administrative data are possible and can become easier to produce by building on work already completed [76,77]. The process of creating comparable performance indicators promoted mutual learning from pre-existing approaches in each province and also prompted updates to existing approaches based on other identified models.
While administrative data do not cover many important aspects of PC performance, there are advantages to using comparable administrative data indicators to assess elements of PC performance. These data are inexpensive because they are already collected. The process of achieving comparable algorithms may introduce some measurement error, though it may be better to have more meaningful statistical comparisons at the cost of slightly reduced internal validity [78]. It is essential to address the tension between the internal validity of the indicators within a province and the external validity of the algorithms across provinces. The intuitive desire to have exactly the same coding across provinces was challenging and required compromise to achieve; instead, a context-tailored definition was developed for each jurisdiction with individuals knowledgeable about each health system, and agreement was reached on the final algorithms. We adapted existing algorithms and generated new ones to create a suite of algorithms that could reasonably be applied across jurisdictions. For comparisons to be possible, each provincial analyst agreed to: 1) discuss how to define comparable indicators, 2) calculate the indicators using their own provincial data, and 3) share results.
Primary care planning, resource allocation, and quality improvement, both at the individual practice level and at the healthcare system level, require accurate measurement of PC performance [79]. Importantly, country-wide comparable indicators are essential for learning health systems [33], as they allow ongoing comparisons and assessments of performance. The scarcity of data to measure PC performance hampers decision-makers' and clinicians' abilities to strengthen it. Initiatives such as the Strategy for Patient-Oriented Research Canadian Data Platform can hopefully contribute to cross-province/territory administrative data algorithm development for PC [80]. Additionally, increasing the use and improving the quality of electronic medical records and patient experience surveys can make robust data available to fill the measurement gap [81]. PC data and performance measurement have high potential to be advanced through these data sources and through other platforms and activities.

Limitations
Our work was limited to administrative data from three provinces: BC, ON, and NS. These data cover community-dwelling residents and only what could be measured with the available data elements. They do not include members of the military or those living on-reserve, as their health care services are captured in federal databases. Similarly, the work of many important non-physician providers, such as nurse practitioners, is not captured. Variable service fee codes, inconsistencies in physician billing practices, and different service definitions are known challenges for cross-provincial initiatives to measure PC performance using administrative data [31,32]. For example, one jurisdiction may provide an after-hours bonus starting at midnight, while another's starts at 8 p.m. Moreover, contextual differences add challenges for inter-provincial comparisons, particularly as family physicians may provide the same services in a variety of settings, from community-based clinics to emergency rooms, long-term care, and hospital in-patient wards. For example, resulting records may differ if one province compensates physicians for a certain service, such as hypertension management, when it is performed in a rural emergency department, while another does so only when that service is provided in an outpatient clinic. Hence, the extent to which ambulatory care records capture the PC services received by any given patient will vary depending on 1) where they received services and 2) how their utilization pattern is recorded in their province.

Conclusion
Arriving at comparable administrative data definitions across provinces is essential to enhancing the performance of PC. This task is challenging and time consuming. However, this study provides foundational work towards establishing PC performance measurement in an inter-jurisdictional manner [11][12][13]. We established 21 indicators using a variety of administrative data sources. We highlighted both the challenges and the strengths of our approach in a setting where data structure and content differ across jurisdictions and data pooling is not a solution.
This work can serve as an approach for developing comparable algorithms of PC performance in health systems comprising different jurisdictions.

Acknowledgements and disclaimers
In British Columbia, data were made accessible by Population Data BC. All inferences, opinions, and conclusions drawn in this article are those of the authors and do not reflect the opinions or policies of the Data Steward(s).
This study was supported by ICES, which is funded by an annual grant from the Ontario Ministry of Health and Long-Term Care (MOHLTC). Parts of this material are based on data and information compiled and provided by MOHLTC, Canadian Cancer Organization and the Canadian Institute for Health Information. The analyses, conclusions, opinions and statements expressed herein are solely those of the authors and do not reflect those of the funding or data sources; no endorsement is intended or should be inferred. Parts of this material are based on data and/or information from the Canadian Drug Product Database and Data Extract, compiled and provided by Health Canada, and used by ICES with the permission of the Minister of Health Canada, 2017. https://www.canada.ca/en/healthcanada/services/drugs-health-products/drug-products/drugproduct-database.html. We thank IMS Brogan Inc. for use of their Drug Information Database.
The data (or portions of the data) used in this report were made available by Health Data Nova Scotia of Dalhousie University. Although this research is based on data obtained from the Nova Scotia Department of Health and Wellness, the observations and opinions expressed are those of the authors and do not represent those of either Health Data Nova Scotia or the Department of Health and Wellness.

Statement on conflicts of interest
The authors have no conflicts of interest to report.

Ethics statement
Transforming CBPHC delivery through comprehensive performance measurement and reporting (TRANSFORMATION)