Evaluation of the relevance and access of EHR-based variables to support personalized medicine in breast cancer

Abstract Background: The increasing number of available cancer therapies renders medical decision-making (MDM) less straightforward. Patients want to know about the outcomes of similarly treated patients. Objective: The goal of this study was to design a breast cancer dashboard (BCD) tool that presents survival information to support MDM activities. Methods: Clinical variables needed during the clinic visit were determined via provider meetings and evaluated for accessibility from medical databases. Women with breast cancer (BC) were interviewed about their health care experiences after cancer diagnosis. We created a cohort of adult women with BC treated at our institution from 1995 to 2012, from which clinical scenarios were defined and used to test survival outcomes. For the BCD, a simple, graphical user interface was built to present point-of-care clinical and survival data. Results: It is feasible to build the BCD using our institution’s databases and generate survival plots to facilitate MDM activities. Patients with early-stage BC had the highest survival rate (82.3%) and the longest mean survival, 7.0 (SD 4.5) years. In late-stage BC, poor prognosis outweighs the influence of number of comorbidities on mortality. The BCD tool promotes more predictive, personalized, and collaborative health care.


PUBLIC INTEREST STATEMENT
As more breast cancer treatment options become available, the shared medical decision-making relationship between physician and patient is becoming less straightforward. To support this important interaction, we have created an institution-specific medical communication tool. Decision aids have been shown to improve the health care experience of patients and ultimately lead to the achievement of higher-quality decisions. Our communication tool, named the breast cancer dashboard (BCD), was designed to be used during the clinic visit. It specifically addresses patients' need to know about the survival outcomes of similar patients treated at our cancer specialty hospital. The BCD brings together comprehensive and breast cancer-specific clinical information. It allows both the provider and patient to navigate the information using a patient-friendly graphic interface. By enhancing shared decision-making, there would be a shift toward patient care that is more predictive, preventive, personalized, and collaborative.

Introduction
Breast cancer is the most common cancer and second leading cause of cancer death among women in the United States (US; SEER Stat Fact Sheets: Breast Cancer, 2015). Recent trends portray a decline in breast cancer-related deaths; however, the death rate has not significantly changed since 2002 (SEER Stat Fact Sheets: Breast Cancer, 2015). At the same time, costs associated with breast cancer remain considerable. The mean lifetime cost of metastatic breast cancer is $110,000 (Taylor et al., 2011). It is estimated that the US annually spends over $1.2 billion on over-diagnosis and unnecessary treatment of indolent breast tumors (Ong & Mandl, 2015).
As more therapies for breast cancer become available, medical decision-making (MDM) related to treatment, management, and cost-to-patient grows in complexity (Green, Biesecker, McInerney, Mauger, & Fost, 2001; Jibaja-Weiss et al., 2011; Leung, Kvizhinadze, Nair, & Blakely, 2016; Senkus & Łacko, 2016; Siminoff, Ravdin, Colabianchi, & Sturm, 2000; Street, Voigt, Geyer, Manning, & Swanson, 1995). MDM is a multi-faceted process, particularly in oncology where decisions about type and duration of treatment are not straightforward. MDM is driven by patient characteristics and preferences, disease and treatment history, actionable biomarkers and/or those with prognostic or predictive value, benefit-risk profile of therapies, and treatment-related costs. Since not all patients respond to the same treatment equally, personalized medicine is an important component of cancer care (Cho, Jeon, & Kim, 2012; Weldon, Trosman, Gradishar, Benson, & Schink, 2012). Studies have demonstrated that women with estrogen receptor positive breast cancer, an independent predictor of outcome, have better survival (Grann et al., 2005). Patients with HER2/neu overexpression derive survival benefit from treatment with trastuzumab along with adjuvant chemotherapy (Smith et al., 2007). Medical treatment optimally tailored to the patient maximizes clinical benefit, avoids unnecessary drug exposure, and helps to curb treatment costs that do not deliver valued outcomes to patients.
Given the complexity of MDM and the ultimate goal to provide the right treatment at the right time to the right patient at the right cost, clinical decision tools have been developed to facilitate discussions between health care professionals and patients. Patients who use decision aids have improved knowledge about disease and treatment (Ravdin et al., 2001; Stacey et al., 2014; Whelan et al., 2003) and reduced feelings of uncertainty and passivity compared with usual care. Decision aids have been shown to be effective in people with lower levels of education (Evans et al., 2010; Trevena, Irwig, & Barratt, 2008). Jones et al. (2009) reported that there is greater benefit from a decision aid delivered by clinicians during a clinic visit compared to prior to a visit. Decision tools have been specifically studied in decision-making in oncology (Stacey, Samant, & Bennett, 2008). Results indicate that patients exposed to decision aids are more likely to be active participants in MDM and achieve higher-quality decisions (Stacey et al., 2008).
We have observed that patients at our institution who are considering treatment options want to know about the outcomes of similar patients who have been treated at our institution. Although we use standard clinical decision tools, such as Adjuvant! Online, to provide national and evidence-based statistics, we are limited in our ability to provide local, real-time information to our patients. Typically, these clinical decision tools provide information drawn from large epidemiological and medical databases, clinical trials, and health care literature. To address this gap, we have designed a clinical decision tool henceforth referred to as a point-of-care breast cancer dashboard (BCD). The BCD draws upon our institution-specific electronic health record (EHR) to provide user-friendly, patient-oriented, aggregated information about survival and prognosis.
In addition to providing local, real-time information to facilitate MDM activities, the BCD enables purposeful, selective filtering of breast cancer-specific patient information in a time-sensitive manner. As with many large hospital systems, our institution's EHR is a clinical database designed for all disease states. Although a broadly designed database allows for flexibility of use, it can also suffer from excess information or inefficiency in locating information. For example, finding medical data that is particularly relevant to breast cancer, such as hormone receptor status or BRCA mutation status, can be time consuming. To find this information, a health care provider may need to click through various parts of a patient's electronic health record and read several types of documentation, such as clinical notes or reports of genetic analysis. One of the strengths of the BCD is its ability to capture and present MDM-pertinent information that is easy to interpret and understand, simple to navigate for both health care provider and patient, and efficient in its application. The BCD can display clinically relevant variables "at a glance" and also objectively display survival plots built from specific patient characteristics.
The goal of this pilot study is to evaluate the relevance of clinical variables in breast cancer treatment, and the accessibility of these clinical variables within research databases towards creating the point-of-care BCD. We designed the BCD with the intention that it be used primarily by the breast cancer clinician as a shared decision-making tool to inform communication between the patient and provider. A recent Cochrane review reported that decision aids are superior to standard of care interventions (Stacey et al., 2014). The patient-centric facet of the BCD was necessary to enhance the patient's comprehension as the provider discussed breast cancer treatment with the help of the BCD. This work incorporated the following: (1) determination of the key clinical data for inclusion in the BCD; (2) evaluation of the accessibility of physician-indicated data from medical databases for inclusion in the BCD; (3) creation of a patient cohort grouped by clinical scenarios; (4) testing of survival outcomes using clinical scenarios; (5) building a simple, graphical user interface to collect key elements at the point-of-care for MDM; and (6) understanding patients' perspectives to make the BCD meaningful and useful.

Methods
To determine key clinical data variables for inclusion in the BCD, one dedicated focus group was conducted involving 10 representative breast cancer providers across various specialties (medical oncology, surgical oncology, radiation oncology, pathology, nurse practitioners, clinical coordinators) practicing at the Huntsman Cancer Institute (HCI). Using the information gathered at this focus group, an electronic questionnaire was designed to evaluate clinical relevance and determine the degree of clinical utility of the clinical variables identified. This questionnaire was then disseminated to a larger group of similarly representative breast cancer clinicians at HCI. Survey participants were asked to determine the clinical value of certain medical data utilized at point-of-care by applying a 3-category scale: "must have," "would like to have," or "unnecessary." Subsequently, we utilized data querying, text searching, and electronic chart review to evaluate the accessibility of these clinical variables in two databases: the University of Utah Enterprise Data Warehouse (EDW), and the HCI Tumor Registry combined with Utah Central Cancer Registry Data (together referred to as UCCR). The accessibility of clinical variables was determined to be "available" (immediate), "difficult to access" (required secondary level searching), or "unavailable." Clinical variables were categorized as "available" if they were successfully abstracted from the databases, "difficult to access" if they required chart review, and "unavailable" if they could not be found in either the EDW or UCCR.
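The three-level accessibility rule described above can be written down compactly. The following is a minimal sketch, not the study's actual procedure: the function names, the `(found_by_query, found_by_chart_review)` input shape, and the example findings are hypothetical, and the sketch assumes that a variable takes the best access level it achieves in any one database.

```python
from enum import Enum
from typing import Dict, Tuple


class Access(Enum):
    AVAILABLE = 0     # abstracted directly from the database
    DIFFICULT = 1     # required chart review (secondary level searching)
    UNAVAILABLE = 2   # could not be found


def classify_in_database(found_by_query: bool, found_by_chart_review: bool) -> Access:
    """Access level of one variable within a single database."""
    if found_by_query:
        return Access.AVAILABLE
    if found_by_chart_review:
        return Access.DIFFICULT
    return Access.UNAVAILABLE


def classify_variable(findings: Dict[str, Tuple[bool, bool]]) -> Access:
    """Best access level across databases (assumption: best level wins).

    `findings` maps a database name to (found_by_query, found_by_chart_review).
    """
    levels = [classify_in_database(q, c) for q, c in findings.values()]
    return min(levels, key=lambda level: level.value, default=Access.UNAVAILABLE)


# Hypothetical example: chart review needed in one database, nothing in the other.
example = {"EDW": (False, True), "UCCR": (False, False)}
print(classify_variable(example).name)
```

Under this reading, a variable is "unavailable" only when every database returns nothing, which matches the definition given for the EDW and UCCR.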
In order to better understand patients' perspectives, we conducted a patient focus group composed of four patient research partners who were at various stages of survivorship. A focus group with a greater number of patients was not performed, since the BCD was designed to be primarily navigated and manipulated by breast cancer providers. In the real world setting, the provider would utilize the BCD during the clinic encounter, capitalizing on its patient-oriented format. Therefore, the input of the patient as a stakeholder in this patient-provider partnership was needed to ensure that the BCD would adequately address patients' needs. To capture the variation in breast cancer patients, the patients in the focus group each had unique personal and disease factors. The patients were recruited through personal contacts and via the physicians in the study. After corresponding by email, we scheduled an in-person meeting. Patients were invited to talk about their health care experiences after they were diagnosed with early stage breast cancer. They expressed their fears, concerns, and need for evidence-based information on issues that were relevant to them.
The next step in building the BCD was creating a breast cancer cohort. Data were extracted from the University of Utah Enterprise Data Warehouse (EDW) and the Utah Population Database (UPDB). The EDW served as the core database, and the UPDB was an adjunct data source. The EDW is the long-term data mart for patient medical, financial, and administrative data that is managed and maintained by the University of Utah Health Science Data Resource Center. The EDW integrates the historical and comprehensive clinical patient records across the University of Utah Healthcare System and HCI. The UPDB is a unique database that supports research for genetics, epidemiology, demography, and public health. It is a rich source of information for diagnostic records about cancer, cause of death, and claims data on more than 7.3 million individuals.
Study subjects included breast cancer patients aged 18 years or older identified with ICD-9-CM codes (174.0-174.9) who were treated at HCI from 1 January 1995 to 31 December 2012. Patients were excluded from the cohort if they lacked records in the HCI Tumor Registry, which included breast cancer staging information. The index date was defined as the date of initial diagnosis of breast cancer. Patient information was collected from the EDW for the following data: demographics; health care plan; comorbidities expressed as the Charlson Comorbidity Index (CCI); and tumor characteristics, such as clinical stage, histologic grade, laterality, Tumor, Node, Metastasis (TNM) Classification of Malignant Tumors staging, site of metastases, and number of metastases. Death information was abstracted from death certificates obtained from the UPDB. If death information was not found in the UPDB, it was obtained from death registration information in the EDW or from vital status in the HCI Tumor Registry.
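The inclusion and exclusion criteria above amount to a simple record filter. The sketch below is illustrative only: the dictionary field names are hypothetical, and applying the 1995-2012 window to the index date is an assumption (the text states the window in terms of treatment at HCI).

```python
from datetime import date

STUDY_START = date(1995, 1, 1)
STUDY_END = date(2012, 12, 31)


def is_breast_cancer_code(icd9: str) -> bool:
    """True for ICD-9-CM codes of the form 174.x (malignant neoplasm of female breast).

    Any single digit after the decimal point is accepted, mirroring the
    stated range 174.0-174.9.
    """
    prefix, _, suffix = icd9.strip().partition(".")
    return prefix == "174" and len(suffix) == 1 and suffix.isdigit()


def is_eligible(patient: dict) -> bool:
    """Apply the cohort criteria to one patient record (field names are hypothetical)."""
    return (
        is_breast_cancer_code(patient["icd9"])
        and patient["age_at_diagnosis"] >= 18
        # Assumption: the study window is checked against the index (diagnosis) date.
        and STUDY_START <= patient["index_date"] <= STUDY_END
        # Patients without a Tumor Registry record (no staging) are excluded.
        and patient["has_tumor_registry_record"]
    )


# Made-up records for illustration; these are not study data.
records = [
    {"icd9": "174.4", "age_at_diagnosis": 61, "index_date": date(2004, 3, 9),
     "has_tumor_registry_record": True},
    {"icd9": "174.9", "age_at_diagnosis": 47, "index_date": date(2010, 6, 1),
     "has_tumor_registry_record": False},
    {"icd9": "175.0", "age_at_diagnosis": 70, "index_date": date(2001, 1, 15),
     "has_tumor_registry_record": True},
]
cohort = [r for r in records if is_eligible(r)]
print(len(cohort))
```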
The primary outcome of the survival analysis was time to death from diagnosis of breast cancer. Survival was calculated as the number of years from index date to date of death or end of the study period, whichever occurred first. To develop the reporting of survival outcomes using aggregated data, we grouped cohort patients by stage of breast cancer. Within each stage, we applied descriptive statistics to patient demographics, CCI, and tumor-specific characteristics. Descriptive statistics were compared among the different disease stages. For the whole cohort population, Cox regression models were built to predict survival by age, CCI, and clinical stage. For sub-analyses by clinical stage, age and CCI were used as predictor variables. Survival was compared among patients at different cancer stages and clinical scenarios via Kaplan-Meier analysis. For a given stage, clinical scenarios varied by age and CCI. For each clinical scenario, a 95% confidence band was included for the respective survival curve to show the range of the proportion of patients who could survive at a given time point.
The prototype BCD was built as a webpage using HTML and the JavaScript library D3 (www.d3js.org). To generate survival curves, time to death was stratified by age and stage variables. Other desirable variables were grayed out with input fields removed to allow provider users to get an idea of scalability while maintaining functionality using a small data set. Python (www.python.org) was used to create tab-delimited files containing survival curve plot data for each combination of variables in a defined clinical scenario. To ensure that a sufficient number of data points was available for plot curve generation, variables were categorized. Age groups were defined as 18-64, 65-74, or ≥ 75 years. Breast cancer disease stage groups were defined as stage I, II, III, and IV. Plot points for the Kaplan-Meier estimate were created using the product limit estimate
$$\hat{S}(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$
where $t_i$ is the duration of study at the $i$-th event time, $d_i$ is the number of deaths at $t_i$, and $n_i$ is the number of individuals at risk at $t_i$. Pictograms (www.iconarray.com/) were used to indicate the survival probability at 2, 5, and 10 years after the diagnosis of breast cancer at a specific clinical stage. A simple JavaScript algorithm was used to match user input with the correct Kaplan-Meier estimate and render the survival plots.
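To make the product limit estimate concrete, the sketch below computes follow-up time with censoring at the end of the study period, derives Kaplan-Meier plot points, and serializes them as tab-delimited text. It is a plain-Python illustration under stated assumptions rather than the study's code: the function names, the 365.25-day year, the column headers, and the toy observations are all hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple

STUDY_END = date(2012, 12, 31)


def survival_years(index_date: date, death_date: Optional[date]) -> Tuple[float, bool]:
    """Return (years of follow-up, died), censoring at the end of the study period."""
    if death_date is not None and death_date <= STUDY_END:
        return (death_date - index_date).days / 365.25, True
    return (STUDY_END - index_date).days / 365.25, False


def kaplan_meier(observations: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Product limit estimate: one (time, survival) point per distinct death time."""
    events = sorted(observations)
    at_risk = len(events)          # n_i: individuals still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0                 # d_i: deaths at time t
        leaving = 0                # deaths plus censored observations at time t
        while i < len(events) and events[i][0] == t:
            deaths += int(events[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def to_tsv(curve: List[Tuple[float, float]]) -> str:
    """Tab-delimited plot data (column names are an assumption)."""
    rows = ["time_years\tsurvival"]
    rows += [f"{t:.2f}\t{s:.4f}" for t, s in curve]
    return "\n".join(rows)


# Toy observations (years, died); not study data.
toy = [(1.0, True), (2.0, False), (3.0, True), (4.0, True), (5.0, False)]
toy_curve = kaplan_meier(toy)
print(to_tsv(toy_curve))
```

Censored patients contribute to the at-risk count until they leave observation but never trigger a drop in the curve, which is why the estimate falls only at death times.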
The design and execution of the study were performed by the listed authors. This study was approved by the University of Utah Institutional Review Board.

Results
HCI breast cancer providers participated in the focus group to determine the clinical variables for inclusion in the BCD. Participants included two medical oncologists, three surgical oncologists, two radiation oncologists, one pathologist, one oncology patient coordinator, one clinical coordinator, and several patient assistants and nurse practitioners. The survey derived from the focus group was sent to the remaining breast cancer providers at HCI. There was a 76% response rate (19 of 25 providers) for the questionnaire asking about the degree of clinical utility of the specified clinical variables. Respondents' mean age was 45.8 years (SD 9.2, range 33-62) and the mean years of experience in providing breast cancer care was 9.9 years (SD 7.6, range 2-25). The majority of questionnaire participants were female (75%). Their specialty practice included medical oncology (26%), clinical research (26%), advanced care practitioner (21%), surgery (21%), basic science or translational research (11%), nursing (11%), pathology (5%), population/outcomes research (5%), radiation therapy (5%), and other area (11%).
Breast cancer providers evaluated 66 clinical variables and categorized them according to degree of clinical utility at point-of-care for inclusion in the BCD (Table 1). Among the 39 variables considered to be "must have" variables, 17 were timeline, 13 were tumor-specific, and nine were patient-specific variables. No prognostic tool variables were considered to be "must have" variables. Among 23 "would like to have" variables, six were patient-specific, six were prognostic tools, two were tumor-specific variables, three were timeline variables, and six were other variables. One timeline variable and three tumor-specific variables were categorized as "unnecessary": cost/charges incurred during therapy, mitotic count, and Scarff/Bloom score.
Table 2 summarizes the feasibility of gathering the identified clinical variables from the EDW and UCCR databases for inclusion in the BCD. The majority of "must have" variables (87%) were accessible from the UCCR, EDW, or both databases. Among the 39 "must have" variables, BRCA testing and status, ECOG performance status, date and site of recurrence, disease-free survival, and treatment delay information were difficult to access. Among the 23 "would like to have" variables, Adjuvant! Online and genetic profiling such as Oncotype DX and MammaPrint were not available. Several variables, including ability to calculate the Adjuvant! Online score, risk of recurrence for similar patients treated at HCI, progression-free survival of similar patients treated at HCI, and appropriateness for clinical trials information, were difficult to access or unavailable in both databases.
Feedback from the patient focus group is summarized as follows: (1) Patients wanted to know the outcomes of other patients who made the same treatment choices that they are currently contemplating. Instead of knowing the outcomes of treatment options in the general population, knowing the outcomes of patients with similar characteristics is more helpful in the decision-making process. (2) Every patient is unique, and each has a different situation. Many factors and variables that influence the risks and benefits of treatment may be important to some patients and not to others. Therefore, identifying characteristics that are most influential to outcomes is important. (3) Outcomes other than survival are important. For example, patients wanted to know how well treatments are tolerated. Patients would also like to know the risk-benefit trade-offs of different treatment options. (4) Patients have different preferences for the information provided, and patient information should tailor evidence to patient preferences. Patients also requested this information be made available in a format that they can take home with them, so that they can think about the information away from the emotionally charged environment of the hospital. (5) Patients are interested in not only the immediate outcomes of treatments but also the long-term consequences of treatments.
For the breast cancer cohort, we identified 4,570 female adult patients from the EDW database and 3,606 patients who had confirmed breast cancer staging in the HCI Tumor Registry. Among these patients, 3,540 had reported cancer stage information within three months of index date and were included in the study population. Table 3 summarizes the baseline characteristics of the breast cancer cohort. Patients with stage I cancer were oldest, followed by patients with stage IV, stage II, and stage III breast cancer. Among patients with more advanced cancer, a higher proportion had a higher CCI score and were more likely to have Medicaid insurance.
During the study period, 965 of 3,540 patients died (27.3%). The mortality rates were 17.7% for stage I, 27.6% for stage II, 37.3% for stage III, and 72.6% for stage IV. Patients initially diagnosed with stage I, II, III, and IV breast cancer had a mean survival of 7.0 (SD 4.5), 6.9 (SD 4.5), 5.3 (SD 4.0), and 2.8 (SD 2.8) years, respectively, and a median survival of 6.3, 6.1, 4.2, and 2.1 years, respectively. Figure 1 shows survival curves among patients with different cancer stages. When all patients are included in the Cox model, older age, higher CCI, and advanced cancer stage are statistically significantly associated with increased risk of mortality (Table 4). When the model is stratified by cancer stage, the age category ≥ 75 years significantly predicts increased risk of mortality at every cancer stage, but the effect size decreases with more advanced stage. However, the age category of 65 to 74 years is only associated with increased risk of mortality in patients with either stage I or II breast cancer. Similarly, CCI score ≥ 3 is only significant for mortality among patients with stage I or II cancer.
The survival curve with 95% confidence band for the selected stage III clinical scenario is shown in Figure 2. This clinical scenario is generated by the BCD: patients with stage III breast cancer aged 65-74 years with CCI ≥ 3. The graphical user interface displayed by the BCD is shown in Figure 3. The view provides a visualization of survival, using both a traditional survival curve and pictograms of survival probability at 2, 5, and 10 years. As shown in Figure 3, the dashboard tool can display survival probability and a survival curve by patient age and tumor stage. For this prototype, data fields are selected using drop-down menus and matched to the correct JavaScript Object Notation (JSON) data. In the future, the dashboard will include the "must have" and "available" variables to predict survival.
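The matching of a drop-down selection to precomputed curve data, and the reading of survival at 2, 5, and 10 years for the pictograms, can be sketched as follows. The prototype performs this step in JavaScript; Python is used here only for consistency with the plot-data generation step. The key format, function names, 100-icon array size, and the numbers in the example payload are assumptions for illustration and are not study results.

```python
import json
from bisect import bisect_right

AGE_GROUPS = ("18-64", "65-74", ">=75")   # age categories from the Methods
STAGES = ("I", "II", "III", "IV")

# Hypothetical JSON payload of [time_years, survival] points; illustrative values only.
EXAMPLE_JSON = """
{
  "III|65-74": [[1.0, 0.90], [3.0, 0.75], [6.0, 0.60]]
}
"""


def scenario_key(age_group, stage):
    """Build the lookup key for one drop-down selection (key format is an assumption)."""
    if age_group not in AGE_GROUPS or stage not in STAGES:
        raise ValueError("unknown age group or stage")
    return f"{stage}|{age_group}"


def survival_at(curve, years):
    """Read a Kaplan-Meier step function: the value at the last event time <= years."""
    times = [point[0] for point in curve]
    idx = bisect_right(times, years)
    return 1.0 if idx == 0 else curve[idx - 1][1]


def pictogram_counts(curve, horizons=(2, 5, 10), icons=100):
    """Number of 'surviving' icons out of `icons` at each horizon (in years)."""
    return {h: round(survival_at(curve, h) * icons) for h in horizons}


curves = json.loads(EXAMPLE_JSON)
selected = curves[scenario_key("65-74", "III")]
print(pictogram_counts(selected))
```

Because a Kaplan-Meier curve is a step function, the lookup takes the most recent plotted value rather than interpolating between points.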

Discussion
Our goal for this research project was to identify variables of high clinical utility to providers for real-time discussion with their patients and to evaluate whether these variables can be readily accessed in our current electronic health record system and related databases. With this information at hand, we have built a tablet-based BCD prototype that has the capability to present overall survival information for a particular clinical scenario, which is defined by age and comorbidity indices. According to the focus group discussion and the questionnaire results, the "must have" variables for inclusion in the BCD should include patient-specific variables, tumor-specific variables, and timeline variables. Most of these variables were accessible in the EDW, UCCR, or both databases. We evaluated the accessibility of these specified clinical variables in active, comprehensive clinical databases. Five variables were "difficult to access" or "unavailable" in both databases. We performed clinical scenario-based patient analyses by disease staging to determine survival outcome and prognosis. Finally, we built a simple, graphical user interface for the BCD tool. The primary intent of this tool is to support MDM between health care providers and patients.
Overall, we conclude that it is feasible to build the BCD tool using our institution's databases (EDW and UCCR). Other similar tools have been used in provider-patient discussions around breast cancer treatment options. The clinical variables identified via the provider meetings are similar to those previously reported (Baumgart, Postula, & Knaus, 2015; Hogarth et al., 2010; Mandelblatt, Kreling, Figeuriedo, & Feng, 2006; Montazeri, Montazeri, Montazeri, & Beigzadeh, 2016; Rejali, Tazhibi, Mokarian, Gharanjik, & Mokarian, 2015; Tammemagi, Nerenz, Neslund-Dudas, Feldkamp, & Nathanson, 2005). For their Communication and Care Plan tool to support shared decision-making, Hogarth et al. (2010) also determined a need for information about histopathology, tumor characteristics, and receptor status. Montazeri et al. (2016) demonstrated the value of survival prediction and MDM with their rule-based classification model. Although our BCD tool does not utilize a machine learning algorithm, we similarly used historical cases and then built survival plots that represent real world practice and outcomes of patients with breast cancer treated at our institution. The Nottingham Prognostic Index has also been evaluated as a decision-making tool and determined to be capable of distinguishing good, moderate, and poor survival rates (Rejali et al., 2015). It was limited in its ability with more specific groups, such as patients with moderate and poor prognosis, and did not include comorbidity information. We included comorbidity data in our BCD tool, because comorbidity has been shown to independently affect survival and is important to consider in an aging population with longer life expectancy (Tammemagi et al., 2005). In addition, shared decision-making has been shown to have an important role in the care for older women with breast cancer (Mandelblatt et al., 2006). The mean age of patients included in our breast cancer cohort used to build the BCD was 57.4 years (SD 13.7).
With the current 5-year overall survival of breast cancer patients estimated to be 89.4%, we anticipate that patients with breast cancer will live longer and reap further benefit from tools that directly support MDM activities (SEER Stat Fact Sheets: Breast Cancer, 2015). Our methodology is unique in being simultaneously specific to the electronic health care data system of a cancer hospital (HCI) and patient population (breast cancer). The tool we have created can (1) meet the request by patients at HCI wanting health outcome information about similar patients treated at HCI and (2) support our breast care providers as they provide this information in a real world setting where efficiency and productive time-spent-with-patient during a clinic encounter are highly valued.
Results from survival analyses by clinical scenario demonstrated that patients with stage I disease had the highest survival rate (82.3%) and the longest mean survival, 7.0 (SD 4.5) years. This was followed by stage II (72.4%, 6.9 years, SD 4.5), stage III (62.7%, 5.3 years, SD 4.0), and stage IV (27.4%, 2.8 years, SD 2.8). Our results indicate that the Charlson Comorbidity Index is associated with increased risk of death among patients with stage I and II breast cancer, but it was not significant among stage III and IV patients. Overall, the influence of number of comorbidities on mortality is outweighed by the poor prognosis associated with burden of disease in advanced breast cancer. We observed age to behave similarly, since its effect on mortality diminishes with more advanced disease. As we anticipated, patients with stage I breast cancer aged 65-74 years with CCI < 3 have much improved survival rates compared to patients with stage III breast cancer at the same age but with CCI ≥ 3.
To populate our tablet-based BCD prototype, we developed simple and informative survival plots built from our institution-specific clinical databases to facilitate medical decision-making at point-of-care. In the next phase of development, we will incorporate treatments for breast cancer into the survival plots. In addition, we hope to include other important patient-reported outcomes, such as treatment-related side effects and patient-reported quality of life. Ideally, predictive survival plots using the identified clinical variables can be developed. Provider input into the clinical utility of the dashboard prototype was critical, as was accessibility of these variables for real-time presentation. These factors are important to consider in the development of real-time, information-based tools intended to be used in direct patient care settings. Ultimately, our goal is to implement the BCD tool on a larger scale.
A point-of-care clinical information tool, such as the BCD, can bring together the necessary components of care and improve quality of care, including early detection, diagnosis, treatment, and survivorship (Hesse & Suls, 2011). Built from a platform that includes comprehensive clinical information of similar patients and similarly treated patients at our institution, the BCD will therefore be able to provide local, real-time MDM-oriented information to inform individualized patient care. As a result, there would be a shift from the current fragmented, reactionary approach to patient care to an approach that is more predictive, preventive, personalized, and collaborative.
There were several limitations in this project. First, the goal of developing a clinical tool to provide local, real-time prognostic information to patients being treated at HCI limits the generalizability to other populations. Second, patients may seek care at other health care facilities. At present, the BCD does not capture this information, and thus, some patients may be misclassified. Third, retrospective extraction of data is subject to missing or incomplete data. For example, only 22.4% of patients had a reported race. Fourth, we did not include breast cancer treatment in the study. Feedback from the patient focus group, such as wanting to know about tolerability of treatment, is valuable to patients and to health care providers. We plan to address this limitation in future studies when we capture all treatments and side effect profiles. Fifth, the survival plots that were developed were based on descriptive analyses. Both descriptive and predictive survival plots are planned for future studies. This would provide more complete information to support the MDM process.