Can different definitions of date of cancer incidence explain observed international variation in cancer survival? An ICBP SURVMARK-2 study

Background: Differences in registration practices across population-based cancer registries may contribute to international variation in survival estimates. In particular, the recorded date of incidence (DOI) varies because cancer registries have access to different sources of information and use different rules to determine an official DOI. This study investigates the impact of different DOI rules on cancer survival estimates.
Materials and methods: Detailed data on dates of pathological confirmation and hospital admittance were collected from three registries participating in the ICBP SURVMARK-2 project (England, Northern Ireland and Norway). Multiple dates of incidence were determined for each cancer patient diagnosed during 2010–2014 by applying three sets of rules that prioritize either: a) histological date, b) hospital admittance date or c) the earliest date recorded. For each set of rules and registry, 1- and 5-year net survival were estimated for eight cancer sites (oesophagus, stomach, colon, rectum, liver, pancreas, lung and ovary).
Results: The mean difference between different DOIs within a country and cancer site ranged from 0.1 to 23 days. The variation in 1- and 5-year net survival using different DOIs was generally small for all registries and cancer sites. Larger 1-year survival differences, of 2–3 %, were found only for liver and pancreatic cancer in Norway and ovarian cancer in England.
Conclusion: In the ongoing discussion of the comparability of survival estimates across registry populations, the use of different DOI definitions can be considered to have a very limited impact.


Introduction
Several large international studies have repeatedly documented persistent differences in cancer survival across countries and cancer sites [1][2][3]. The International Cancer Benchmarking Partnership (ICBP) was established in 2009 to explore factors that contribute to this variation [4]. The ICBP SURVMARK-2 project is part of the second phase of ICBP, dedicated to increasing the understanding of survival differences and focusing on eight cancer sites [5]. In an overview paper, using data from 21 participating jurisdictions in seven high-income countries between 1995 and 2014, ICBP SURVMARK-2 reported improved survival over time, but still persistent differences in survival for seven cancer sites [6].
Possible explanations for the observed international variation in cancer survival are many and continue to make survival benchmarking challenging. While stage at diagnosis and treatment are obvious key determinants of patient outcomes, differences in registration practices across population-based cancer registries are considered another source of variation in survival estimates [7]. Previous research has investigated different aspects of this, including the impact of a failure to link cancer cases to death registries, of missing long-term survivors, of cases only notified from death certificates (often excluded from survival analysis) and of finding dates of cancer recurrence rather than of new diagnoses [8][9][10][11].
https://doi.org/10.1016/j.canep.2020.101759 Received 23 March 2020; Received in revised form 27 May 2020; Accepted 30 May 2020
Internationally agreed standards to determine date of incidence (DOI) exist to facilitate harmonization and comparability across time and jurisdictions. Yet, these rules vary. For example, the European Network of Cancer Registries (ENCR) prioritizes date of pathological confirmation over date of hospital admittance [12], whereas the International Agency for Research on Cancer (IARC) prioritizes date of hospital admittance over date of pathological confirmation [13]. In addition, some registries, like Norway, use their own definitions of DOI. Furthermore, even where registries use the same registration standards, differences might still exist due to variations in basis of diagnosis, i.e., choices of which source of information is used to decide the official DOI.
In a more recent paper that was part of the first phase of ICBP, the issue of using different definitions of date of incidence (DOI) was considered to be a contributing factor to variations in survival [14]. This paper thus seeks to address this issue in greater depth using detailed data obtained from three cancer registries participating in the second phase of ICBP. Using detailed information on relevant dates, the objective of this study is to assess the impact of applying different rules to determine DOI on survival estimates up to 5 years after diagnosis for eight cancer sites diagnosed in 2010-2014.
Three of the registries (England, Northern Ireland and Norway) sent additional detailed data on recorded dates. While England and Northern Ireland provided recorded dates of pathological confirmation and dates of hospital admittance due to the cancer of interest, Norway additionally provided a clinical date of diagnosis reported in standardized forms by clinicians working in the specialist health service. The hospital admittance dates provided by England and Northern Ireland included inpatients only, whereas the hospital admittance date provided by Norway included both inpatients and outpatients. As hospital admittance dates were only available from 2009 in Norway and the completeness of pathology dates in England was highest from 2010 (possibly due to a gradual implementation of electronic reporting), only cases with an official DOI in the period 2010-2014 were selected for analyses. Furthermore, only cases where at least one of the dates, pathological or admission, was available were considered eligible for analyses. Table 1 shows the number and proportion of the total number of cases that were included in the analyses. Excluded cases typically had a clinical basis of diagnosis registered as the official basis. Most likely these cases did not have an admission date, but had a date recorded from other clinical sources (for example from the Cancer Outcomes and Services Dataset (COSD) in England). The ICBP SURVMARK-2 overview paper by Arnold et al. has further details on inclusion criteria and the quality controls used in developing the final dataset [6].
National numbers on all-cause mortality for the period 2010-2014 were obtained from the following sources: Office for National Statistics (England), the General Registrar's Office for Northern Ireland and the Norwegian Institute of Public Health. Ethical approval for the ICBP SURVMARK-2 project was obtained from each participating registry, whenever required, as well as from the IARC Ethics Committee.

Definitions of date of incidence
From the datasets provided by the three registries we generated DOIs based on three different sets of rules, prioritizing either:
a) the date of pathological confirmation (ENCR rules);
b) the date of hospital admittance (IARC rules); or
c) the earliest date recorded (ED rules).
[...] rules corresponds to the official recorded DOI. For Norway, none of the above rules corresponds to the official DOI, as this is determined by using the earlier of the pathological date and the clinical date only.
Table 1 notes: * Admittance date available for inpatients only in England and Northern Ireland and for both inpatients and outpatients in Norway. ** The numbers included refer to those cases where at least one of the two listed dates was available.
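The prioritisation logic behind the three sets of rules can be sketched in a few lines of Python. This is a simplified illustration only: the function name `derive_doi`, its arguments and the example dates are hypothetical, and the full ENCR and IARC recommendations contain further conditions (and further date sources) that are not modelled here.

```python
from datetime import date
from typing import Optional


def derive_doi(pathology: Optional[date],
               admission: Optional[date],
               rule: str) -> Optional[date]:
    """Return a date of incidence under a simplified prioritisation rule.

    rule is one of "ENCR" (pathology date first), "IARC" (admission date
    first) or "ED" (earliest available date). This is an assumption-laden
    simplification, not the registries' actual implementation.
    """
    available = [d for d in (pathology, admission) if d is not None]
    if not available:
        # Neither date recorded: the case would not be eligible for analysis.
        return None
    if rule == "ENCR":
        return pathology if pathology is not None else admission
    if rule == "IARC":
        return admission if admission is not None else pathology
    if rule == "ED":
        return min(available)
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical patient: admitted 12 days before pathological confirmation.
example = {
    rule: derive_doi(date(2012, 3, 14), date(2012, 3, 2), rule)
    for rule in ("ENCR", "IARC", "ED")
}
```

For the hypothetical patient above, the ENCR-style rule returns the pathology date, while the IARC-style and ED rules both return the earlier admission date; when only one of the two dates is recorded, all three rules fall back to it.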

Statistical analyses
Site-specific age-standardised net survival at 1 and 5 years after diagnosis was estimated using the Pohar Perme estimator, implemented in the user-written Stata command stnet [15,16]. Background mortality in the general population of each jurisdiction was obtained from lifetables of all-cause death rates by sex, single year of age and calendar year. Age-standardisation was carried out using international cancer survival standard (ICSS) weights [17]. The complete approach was used to obtain estimates for the period 2010-2014 combined [18]. Complementary to these survival estimates, descriptive statistics are provided, including the mean, median and 25th-75th percentiles of the differences between the DOIs according to the different rules, measured in number of days. Since the impact of different DOIs on survival did not depend on sex, only estimates for both sexes combined are presented.
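As an illustration of the age-standardisation step only (the net survival estimation itself was carried out with stnet in Stata and is not reproduced here), direct standardisation amounts to a weighted average of age-specific estimates. The sketch below assumes five age groups; the function name `age_standardise` and all numbers, including the weights, are placeholders chosen for the example, and the weights actually applied are those given in [17].

```python
from typing import Sequence


def age_standardise(survival_by_age: Sequence[float],
                    weights: Sequence[float]) -> float:
    """Weighted average of age-specific net survival estimates.

    Weights are normalised by their sum, so they need not add up to 1.
    """
    if len(survival_by_age) != len(weights):
        raise ValueError("need one weight per age group")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in zip(survival_by_age, weights)) / total


# Placeholder values for five age groups (youngest to oldest); not study data.
illustrative_weights = [0.07, 0.12, 0.23, 0.29, 0.29]
illustrative_survival = [0.80, 0.75, 0.70, 0.60, 0.40]

standardised = age_standardise(illustrative_survival, illustrative_weights)
```

Because the same weights are applied in every registry, differences in the standardised estimates are not driven by differences in the age distribution of the patients.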

Results
There were 502 601, 16 731 and 45 459 cases in England, Northern Ireland and Norway, respectively, diagnosed with oesophagus, stomach, colon, rectal, liver, pancreas, lung or ovarian cancer between 2010 and 2014, potentially eligible for survival analyses. Including only those cases where at least a pathology date or a hospital admittance date was available meant that 465 439 cases (93 %) in England, 16 190 cases (97 %) in Northern Ireland and 44 573 cases (98 %) in Norway were included for analyses in this study. The excluded cases were older, with a median age at diagnosis of 78, 80 and 81 years for England, Northern Ireland and Norway, respectively, compared with a median age of 72, 71 and 70 years, respectively, for the included cases. Table 1 shows the number of cases with a pathology date, a hospital admittance date and with at least one of the two dates available, for each country and cancer site. Generally, hospital admittance dates were available for a large proportion of the eligible cases. The same is true for pathology dates, except for liver and pancreatic cancer, where the proportion of eligible cases with a pathology date is markedly lower (ranging between 34.3 % for liver cancer in England and 68.3 % for pancreatic cancer in Norway).
To understand the potential impact of using different definitions of DOI on survival, we compared the difference in number of days between the ENCR and IARC rules, and between the ENCR and ED rules. Table 2 shows the mean difference in days with the corresponding standard deviation, and also the median difference with the corresponding 25th and 75th percentiles. Overall the mean differences were small, ranging from 0.1 days to 23 days, and tended to be largest in England (2.1-23 days) and smallest in Northern Ireland (0.1-1.6 days). The variation, measured in standard deviations, was also largest in England. The median difference was zero or one day across all cancer sites and countries. There was also minor variation in the 25th and 75th percentiles: most of these estimates were zero and all were within the range ± 14 days. The differences between the ENCR and IARC rules were generally greater than those between the ENCR and ED rules; the observed differences did not vary much across cancer sites.
Table 3 shows point estimates and corresponding 95 % CIs for 1- and 5-year age-standardised net survival for all combinations of cancer sites, countries and different definitions of DOI. The variation in 1-year net survival across different definitions within the same site and jurisdiction was generally small. The biggest difference in England was found for ovarian cancer, where the 1-year net survival was 68.8 % using the IARC rule, compared to 70.6 % and 70.7 % using the ENCR and ED rules, respectively. For all other cancer sites, the absolute differences in 1-year survival for England were smaller than 1.5 percentage points. For Norway the largest differences in 1-year net survival across the different DOI rules were found for pancreatic and liver cancer. Using the ED rules, the estimates were 32.4 % and 41.7 % for pancreas and liver, respectively, whereas using the ENCR rules, the estimates were lower by 2.2 and 3.3 percentage points, at 30.2 % and 38.4 %, respectively. For the six other cancer sites, the absolute differences in 1-year estimates in Norway were smaller than 1.5 percentage points. For Northern Ireland there was almost no variation in 1-year net survival estimates, with absolute differences lower than 0.5 percentage points. The corresponding variation in 5-year net survival across the different sets of dates was generally smaller than that seen for 1-year net survival.
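The day-level comparison of DOIs summarised in Table 2 (mean, standard deviation, median and 25th and 75th percentiles) can be sketched as follows. This is a minimal illustration assuming paired lists of dates per patient; the function name `summarise_differences`, the sign convention (first rule minus second rule), the quantile method and the five example patients are assumptions, not the study's actual code or data.

```python
import statistics
from datetime import date
from typing import Dict, Sequence


def summarise_differences(doi_a: Sequence[date],
                          doi_b: Sequence[date]) -> Dict[str, float]:
    """Summarise per-patient differences (doi_a minus doi_b) in days."""
    if len(doi_a) != len(doi_b):
        raise ValueError("need one date under each rule per patient")
    diffs = [(a - b).days for a, b in zip(doi_a, doi_b)]
    # quantiles(..., n=4) returns the three quartile cut points.
    p25, median, p75 = statistics.quantiles(diffs, n=4, method="inclusive")
    return {
        "mean": statistics.mean(diffs),
        "sd": statistics.stdev(diffs),
        "median": median,
        "p25": p25,
        "p75": p75,
    }


# Hypothetical dates for five patients under two rules.
encr_dates = [date(2012, 1, 10), date(2012, 2, 1), date(2012, 3, 5),
              date(2012, 4, 4), date(2012, 5, 20)]
iarc_dates = [date(2012, 1, 10), date(2012, 2, 1), date(2012, 3, 4),
              date(2012, 4, 1), date(2012, 5, 4)]

summary = summarise_differences(encr_dates, iarc_dates)
```

In this toy example a single patient with a 16-day gap pulls the mean well above the median, the same pattern seen in Table 2, where means reach up to 23 days while medians stay at zero or one day.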

Discussion
The main finding of this study is that the within-registry variations in 1-year net survival estimates across different definitions of DOI are small and tend to be even smaller when 5-year survival estimates are considered. The fact that this observation is seen in all three registries across eight cancer sites indicates that survival estimates are robust to differences in recorded DOI. Further, the ICBP SURVMARK-2 study mainly includes cancer sites with moderate to poor survival where we would expect the effect of different definitions of DOI to have the largest impact.
The Norwegian Cancer Registry (NCR) is different from the registries in England and Northern Ireland in that it follows the ED rule for determining DOI. A recent paper by Eden et al. [14] reported that NCR records the DOI from a physician in either primary or secondary care, and that when such practice is simulated using regional English data this earlier recorded DOI could have a large impact on survival estimates. While that study reported that the reduction in 1-year net survival for lung cancer could be as large as 7.3 percentage points if an ED rule, based on physician's diagnosis, was used instead of the ENCR rule (pathological confirmation), these simulated results do not fully apply to the Norwegian health care and data collection context. The NCR receives notifications directly from clinicians working in the specialist healthcare service, not in primary care, with some exceptions for skin and melanoma cancers. In this study, we have shown using NCR data that the survival estimates using the ED rule are similar to those using the ENCR rule, because clinicians often wait for histological verification, commonly recording this date as the date of diagnosis. Specifically, for lung cancer the estimated difference between England and Norway is only reduced by 1.1 percentage points when applying the ENCR rules to Norwegian data. This shows that knowledge of the data collection processes is needed, and that utilisation of local data in such analyses is important.
The hospital admittance dates include only inpatients for England and Northern Ireland. For Norway, this date also includes outpatients, although it is worth noting that the survival variations are not dissimilar in Norway and England. One may speculate that there are resulting differences in the definition of inpatients between the two countries, with day surgery considered inpatient treatment among a subset of cases. Still, the findings indicate that whether registries are more similar to Norway or England in practice, having access to admittance dates for inpatients only, or for both inpatients and outpatients, is not crucially important when estimating and comparing survival.
In this study we only used dates of pathological confirmation and hospital admission, as well as a clinical notification date for Norway, to look at effects of different definitions of DOI on survival estimates. Fig. 1 illustrates the sources of information available to the 21 registries participating in ICBP Phase-2. Although pathology/haematology and in-patient records are the two most widely used sources, most registries receive (additional) information on cases from other sources, primarily source information related to treatment and imaging as well as from death certificates. However, many registries do not routinely store these different dates since typically a DOI from a lower priority source (e.g. treatment) gets overwritten once a DOI from a higher priority source becomes available (e.g. pathology). For this reason, the comparative analyses undertaken in this study are difficult to replicate on a wider scale, and thus we were only able to compare three of the 21 high quality cancer registries involved in the ICBP SURVMARK-2 project.
The detailed information on dates in this study did not always come from routinely recorded and quality-assured data. Dates that come directly from pathology and hospital admittance data in England are extracted from electronic records, without undergoing routine quality assurance. The same is true for the dates on hospital admittance in Norway, which are electronically received from the Norwegian Patient Registry. It may thus be that for some cases, the recorded date is actually a date related to a recurrence and not the primary cancer. It may also be that some patients, although being investigated for cancer symptoms, are hospitalised due to other medical conditions, so that the recorded date of hospitalisation due to cancer is later than it should be. As an example, in the data delivered from England, the maximum difference between the pathology and admission date was 3992 days, or approximately 11 years. This might indicate that, for this particular case at least, the admission date refers to the primary cancer and the pathology date refers to a later recurrence. These features of the data could, at least partly, explain why the mean differences between DOIs are largest in England. However, these issues most likely affect a very small proportion of cases; in the English data, the proportion of cases where the absolute difference between pathology and admission date is larger than 100 days is less than 4%, while for the Norwegian data the proportion is only 1.3 %. Thus, we believe further quality assurance of the data used in this study would make only negligible differences to the results presented here.
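A simple check of the kind described above, the share of cases whose pathology and admission dates lie more than 100 days apart, could look like the following sketch. The function name `share_with_large_gap`, the choice of denominator (cases with both dates recorded) and the example dates are assumptions for illustration; the text does not specify the denominator used.

```python
from datetime import date
from typing import Optional, Sequence, Tuple

DatePair = Tuple[Optional[date], Optional[date]]


def share_with_large_gap(pairs: Sequence[DatePair],
                         threshold_days: int = 100) -> Optional[float]:
    """Proportion of cases whose two dates differ by more than the threshold.

    Only cases with both a pathology and an admission date are counted.
    Returns None if no case has both dates.
    """
    complete = [(p, a) for p, a in pairs if p is not None and a is not None]
    if not complete:
        return None
    flagged = sum(1 for p, a in complete
                  if abs((p - a).days) > threshold_days)
    return flagged / len(complete)


# Hypothetical (pathology date, admission date) pairs.
cases = [
    (date(2012, 3, 14), date(2012, 3, 2)),   # 12 days apart
    (date(2013, 6, 1), date(2013, 6, 1)),    # same day
    (date(2014, 9, 30), date(2011, 1, 15)),  # years apart, possible recurrence
    (date(2012, 11, 5), date(2012, 10, 20)), # 16 days apart
    (None, date(2013, 2, 2)),                # pathology date missing
]

large_gap_share = share_with_large_gap(cases)
```

Cases flagged in this way are candidates for manual review, for example to establish whether one of the two dates refers to a recurrence rather than the primary cancer.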
This study focuses on the impact of differences in DOIs on survival. Other potential differences in registration routines could also affect survival estimates. Examples of such differences are the completeness of case ascertainment, the intensity of trace-back of death certificate initiated (DCI) cases and the accuracy of tracking vital status. Forthcoming publications from the ICBP SURVMARK-2 project will address several of these issues.

Conclusion
The findings from this study indicate that, at least in countries with universal access to healthcare and with high quality cancer registries, different definitions of DOI only marginally affect estimates of net survival, and have only a minor bearing on survival comparability in benchmarking exercises. We recommend that, when feasible, registries routinely record multiple dates from primary sources, not only as means to replicate these findings, but to support a comprehensive approach to data quality evaluation at their institutions.