Geospatial disparities in survival of patients with breast cancer in sub-Saharan Africa from the African Breast Cancer-Disparities in Outcomes cohort (ABC-DO): a prospective cohort study

Summary

Background There is an urgent need to improve breast cancer survival in sub-Saharan Africa. Geospatial barriers delay diagnosis and treatment, but their effect on survival in these settings is not well understood. We examined geospatial disparities in 4-year survival in the African Breast Cancer-Disparities in Outcomes cohort.

Methods In this prospective cohort study, women (aged ≥18 years) newly diagnosed with breast cancer were recruited from eight hospitals in Namibia, Nigeria, South Africa, Uganda, and Zambia. They reported sociodemographic information in interviewer-administered questionnaires, and their clinical and treatment data were collected from medical records. Vital status was ascertained by contacting participants or their next of kin every 3 months. The primary outcome was all-cause mortality in relation to rural versus urban residence, straight-line distance, and modelled travel time to hospital, analysed using restricted mean survival time, Cox proportional hazards, and flexible parametric survival models.

Findings 2228 women with breast cancer were recruited between Sept 8, 2014, and Dec 31, 2017. 127 were excluded from analysis (58 had potentially recurrent cancer, had previously received treatment, or had no follow-up; 14 from minority ethnic groups with small sample sizes; and 55 with missing geocoded home addresses). Among the 2101 women included in analysis, 928 (44%) lived in a rural area. 1042 patients had died within 4 years of diagnosis; 4-year survival was 39% (95% CI 36–42) in women in rural areas versus 49% (46–52) in urban areas (unadjusted hazard ratio [HR] 1·24 [95% CI 1·09–1·40]). Among the 734 women living more than 1 h from the hospital, the crude 4-year survival was 37% (95% CI 32–42) in women in rural areas versus 54% (46–62) in women in urban areas (HR 1·35 [95% CI 1·07–1·71] after adjustment for age, stage, and treatment status). Among women in rural areas, mortality rates increased with distance (adjusted HR per 50 km 1·04, 1·01–1·07) and travel time (adjusted HR per h 1·06, 1·02–1·10). Among women with early-stage breast cancer receiving treatment, women in rural areas had a strong survival disadvantage (overall HR 1·54, 1·14–2·07 adjusted for age and stage; >1 h distance adjusted HR 2·14, 1·21–3·78).

Interpretation Geospatial barriers reduce survival of patients with breast cancer in sub-Saharan Africa. Specific attention is needed to support patients with early-stage breast cancer living in rural areas far from cancer treatment facilities.

Funding US National Institutes of Health (National Cancer Institute), Susan G Komen for the Cure, and the International Agency for Research on Cancer.


Data assembly
a. Road network. We assembled the latest road network for each country from OpenStreetMap [1]. The road network was reclassified into four classes, namely primary, secondary, tertiary, and minor, based on the road attributes data from OpenStreetMap (see Figure S1a). 1
b. Land use. We used land cover data in areas where there was either no road network or data from OpenStreetMap might have been incomplete. The land cover information was based on Sentinel-2 satellite data at 10 m × 10 m spatial resolution (see Figure S1b). 2,3
c. Travel barriers. The barriers to travel considered were water bodies and protected areas. They were considered impassable except in the presence of a bridge where a road intersected a water body (see Figure S1b). 4
d. Digital elevation model. The slope of the land, which impedes walking, was derived from Shuttle Radar Topography Mission digital elevation models at 30 m × 30 m resolution. 5,6 The walking speeds were corrected according to Tobler's formulation, an exponential function that describes how human walking speed varies with slope (see Figure S1c). 4,7

Modelling travel times
[...][10] For example, in Tanzania,11 Ethiopia,12 Uganda,13 Rwanda,14 Ghana,15 Niger,16 Sierra Leone,17,18 Togo,19 Mozambique,20 and Namibia.21 Specifically, to compute travel time, we implemented a least-cost path algorithm that minimizes the total travel duration between each patient's residential location and the hospital. This differs from a Euclidean distance because the least-cost path accounts for the mode of transport and corresponding travel speeds used by each patient, as well as travel barriers.
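The slope correction and least-cost search described above can be illustrated with a minimal sketch. It is not AccessMod's implementation: the function names, the 4-connected neighbourhood, the toy 3 × 3 speed grid, and the speeds of 50 km/h are all assumptions made for illustration. The walking speed uses the commonly cited form of Tobler's hiking function, 6·exp(−3.5·|slope + 0.05|) km/h, and the search is a plain Dijkstra over raster cells where a speed of 0 marks an impassable barrier.

```python
import heapq
import math


def tobler_speed_kmh(slope):
    """Tobler's hiking function: walking speed in km/h for slope = rise / run."""
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))


def least_cost_travel_time(speed_kmh, cell_km, start, goal):
    """Minimal travel time in hours between two cells of a speed raster.

    speed_kmh[r][c] is the travel speed in a cell; 0 marks a barrier.
    Moves are 4-connected (an illustrative simplification). A step costs
    half a cell at the speed of each of the two cells it links.
    """
    rows, cols = len(speed_kmh), len(speed_kmh[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        t, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return t
        if t > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if speed_kmh[r][c] <= 0 or speed_kmh[nr][nc] <= 0:
                continue  # barrier such as a water body or protected area
            step = 0.5 * cell_km * (1.0 / speed_kmh[r][c] + 1.0 / speed_kmh[nr][nc])
            nt = t + step
            if nt < best.get((nr, nc), math.inf):
                best[(nr, nc)] = nt
                heapq.heappush(queue, (nt, (nr, nc)))
    return math.inf  # goal unreachable


# Hypothetical 3 x 3 raster of 1 km cells: walking on flat land in the
# left column, a barrier in the middle, and a road (assumed 50 km/h) elsewhere.
walk = tobler_speed_kmh(0.0)
grid = [
    [walk, 0.0, 50.0],
    [walk, 0.0, 50.0],
    [50.0, 50.0, 50.0],
]
hours = least_cost_travel_time(grid, 1.0, (0, 0), (0, 2))
print(f"Travel time: {hours:.2f} h")
```

In this toy grid the straight-line distance between the two corner cells is 2 km, but the barrier forces a six-step detour, which is the kind of difference between Euclidean distance and modelled travel time that the text describes.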
The algorithm was implemented using AccessMod 5.7.17 (WHO, Geneva, Switzerland), a free, open-source, user-friendly tool supported by the World Health Organization (WHO) to analyse geographic accessibility. 7,22 AccessMod imports user-provided geospatial datasets (see Data assembly section) to inform the parameters required to calculate the travel time to the hospital. 7,22 First, the "merge land cover module" in AccessMod was used to overlay and merge the road network, land cover, water bodies, and protected areas into a single raster dataset, to which different modes of transport were applied based on the patient records. Second, the least-cost path (cost measured in terms of time) was invoked in the "accessibility module" of AccessMod to compute the cumulative travel time from residences to the utilised facility by bringing together the merged layer, hospitals, and travel speeds (see Figure S1d). The travel speeds assigned to each road class, land cover type, and travel scenario were based on a review of previous comparable studies. 8,23 The analysis was conducted at 1 km spatial resolution, considering one country (and its neighbouring countries where recruited patients resided) at a time.

Note: HRs and 95% CIs estimated using flexible parametric survival models. Abbreviations: CI, confidence interval; HR, hazard ratio; SEP, socioeconomic position score. a Peak HR was defined as the highest lower CI.

IV. SENSITIVITY ANALYSIS: GIS-BASED RURAL VS. URBAN
The main analyses of rural vs. urban disparities in survival are based on a dichotomization of the four options in the baseline questionnaire describing the level of urbanization of each participant's area of residence: city or town (urban) and village or rural (rural).
As a sensitivity analysis, we used GIS methods to classify residential neighbourhoods as rural vs. urban.

Methods
To do this, we constructed a Global Human Settlement Model (GHS-SMOD-R2023A) by applying the Degree of Urbanisation Stage I methodology recommended by the UN Statistical Commission. 1 It was generated by integrating built-up surface data (GHS-BUILT-S R2023) and population data (GHS-POP R2023); these data are available from the Joint Research Centre Data Catalogue at five-year intervals, so the year 2015 was selected to overlap with the ABC-DO baseline data collection period. 2 The resulting gridded surface had 1 km spatial resolution and contained seven classes describing a continuum of urbanicity, to which we assigned the following ordered values to allow logical arithmetic on the raster [3]: very low density rural grid cell (value=1), low density rural grid cell (2), rural cluster grid cell (3), suburban or peri-urban grid cell (4), semi-dense urban cluster grid cell (5), dense urban cluster grid cell (6), and urban centre grid cell (7). We then created buffers of 3 km to approximate the neighbourhood around each household location and overlaid them on the gridded surface. The mean value for each buffer was then extracted.
Addresses with a mean urbanicity of ≥4 (the value corresponding to "suburban or peri-urban") were classified as urban while the rest (<4) were classified as rural.
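The buffer-and-threshold rule can be sketched as follows. This is an illustrative approximation, not the GIS workflow used in the study: the function names are hypothetical, the raster is a plain list of lists with one cell per km, and the 3 km buffer is approximated by all cells whose centres lie within 3 cells of the household's cell.

```python
import math


def mean_urbanicity(grid, row, col, radius_cells=3):
    """Mean urbanicity value (1-7) of cells whose centres lie within the buffer.

    grid is a 1 km raster of ordered urbanicity values; (row, col) is the
    cell containing the household. Cells outside the raster are ignored.
    """
    values = []
    for r in range(max(0, row - radius_cells), min(len(grid), row + radius_cells + 1)):
        for c in range(max(0, col - radius_cells), min(len(grid[0]), col + radius_cells + 1)):
            if math.hypot(r - row, c - col) <= radius_cells:
                values.append(grid[r][c])
    return sum(values) / len(values)


def classify(mean_value, threshold=4.0):
    """Apply the cut-off from the text: mean >= 4 is urban, otherwise rural."""
    return "urban" if mean_value >= threshold else "rural"


# Hypothetical 7 x 7 km neighbourhood: low density rural cells (2) with a
# small dense urban cluster (6) in one corner.
example = [[2] * 7 for _ in range(7)]
for r in range(3):
    for c in range(3):
        example[r][c] = 6

mean_value = mean_urbanicity(example, 3, 3)
print(classify(mean_value))
```

Because the rule averages ordered class values over the buffer, a household next to a small dense cluster can still be classified as rural when most of the surrounding cells are low density.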

Results
Overall, there was broad agreement between the two methods. A high proportion of self-reported rural areas were classified as rural using GIS methods, and the proportions decreased with increasing urbanicity of the self-reported categories (town, city; Figure S8). The HRs, although attenuated in magnitude compared with those for self-reported rural vs. urban residence, showed a similar pattern to the original analysis.

Discussion
Self-reported rural vs. urban residence may be more valid in the SSA context than GIS-based globally standardized definitions of urbanicity (used in a sensitivity analysis here), which have led to surprising results in Africa; for example, identifying 67 cities vs. 20 formally recognized in South Africa. 4,5 Defined only by population size and density, they may not be suited to characterizing the urban-rural continuum in SSA, 4,6 and misclassification could occur if, due to lack of formal addresses, some ABC-DO women reported recognizable landmarks in the nearest village. Such reports would produce relatively small errors when calculating distance or travel time, but would misclassify rural women into more populated/urban areas.

Note: HRs and 95% CIs estimated using Cox proportional hazards models stratified on study site/population (Namibia non-Black, Namibia Black, Uganda, Nigeria, South Africa, Zambia). Stratified HRs were produced from models with interaction terms between the main geospatial variable of interest and the effect modifier. Abbreviations: HR, hazard ratio; SEP, socioeconomic position score.

Figure S2. Kaplan-Meier survival curves for women diagnosed with breast cancer participating in ABC-DO (n=2101), by rural (orange) vs. urban (blue) residence, separately by study site/population.

Figure S1. GIS data used in the calculation of travel times (example: Uganda and neighbouring countries).


Figure S3. Hazard ratios (HRs) and 95% confidence intervals (CIs) for self-reported rural vs. urban residence in crude and adjusted models [orange], with and without additional adjustment for distance [grey] and travel time [black].

Figure S6. Hazard ratios (HRs) and 95% confidence intervals (CIs) for distance and travel time, stratified by self-reported rural vs. urban residence.

Figure S7. Time-dependent HRs for self-reported rural vs. urban residence, overall and among women living ≥50 km from the hospital.