The effect of social deprivation on clinical outcomes and the use of treatments in the UK cystic fibrosis population: a longitudinal study

Summary
Background Poorer socioeconomic circumstances have been linked with worse outcomes in cystic fibrosis. We assessed whether a relation exists between social deprivation and individuals' clinical and health-care outcomes.
Methods We did a longitudinal registry study of the UK cystic fibrosis population younger than 40 years (8055 people with 49 337 observations for weight, the most commonly collected outcome, between Jan 1, 1996, and Dec 31, 2009). We assessed data for weight, height, body-mass index, percent predicted forced expiratory volume in 1 s (%FEV1), risk of Pseudomonas aeruginosa colonisation, and the use of major cystic fibrosis treatment modalities. We used mixed effects models to assess the association between small-area deprivation and clinical and health-care outcomes, adjusting for clinically important covariates. We give continuous outcomes as mean differences, and binary outcomes as odds ratios, comparing extremes of deprivation quintile.
Findings Compared with the least deprived areas, children from the most deprived areas weighed less (standard deviation [SD] score −0·28, 95% CI −0·38 to −0·18), were shorter (−0·31, −0·40 to −0·21), had a lower body-mass index (−0·13, −0·22 to −0·04), were more likely to have chronic P aeruginosa infection (odds ratio 1·89, 95% CI 1·34 to 2·66), and had a lower %FEV1 (−4·12 percentage points, 95% CI −5·01 to −3·19). These inequalities were apparent very early in life and did not widen thereafter. On a population level, after adjustment for disease severity, children in the most deprived quintile were more likely to receive intravenous antibiotics (odds ratio 2·52, 95% CI 1·92 to 3·17) and nutritional treatments (1·78, 1·44 to 2·20) compared with individuals in the least deprived quintile. Patients from the most disadvantaged areas were less likely to receive DNase or inhaled antibiotic treatment.
Interpretation In the UK, children with cystic fibrosis from more disadvantaged areas have worse growth and lung function compared with children from more affluent areas, but these inequalities do not widen with advancing age. Clinicians consider deprivation status, as well as disease status, when making decisions about treatments, and this might mitigate some effects of social disadvantage.
Funding Medical Research Council (UK).

This data supplement contains additional information on the methods employed in this study. In addition, selected tables, plots and results are presented to aid interpretation of the results.

Design, setting and data source
The UK registry is administered by the UK Cystic Fibrosis Trust, and records information about the health and treatment of patients from birth. Data are now routinely collected in a standardised fashion at over 50 British cystic fibrosis specialist centres. Patients attending the British centres are seen in the outpatient clinic for a comprehensive annual review, including evaluation of clinical status, pulmonary function, microbiology of lower respiratory tract secretions, and use of major CF-related therapies.
The registry data have been used previously in a number of epidemiological studies in CF 1,2. The data "cut" used in this study contains data collected between 1996 and 2010, and has been through rigorous quality control by data managers at the CF Trust and external consultants at Imperial College, London, who prepare the annual review reports. This includes screening for and removal of duplicates, and tracking of patient transition from paediatric to adult centres. Deaths are verified by checking with the Office for National Statistics (ONS). In 2000, the dataset was estimated to contain biographical information on over 92% of the estimated UK CF population 3, and registrations have increased year on year subsequently. Furthermore, the CF Trust has written to every paediatrician and adult chest physician in the UK to obtain data on CF patients, and on this basis the estimated coverage is above 99% (personal communication, Diana Bilton).
The UK registry started as the UK CF Database, which was established at the University of Dundee, Scotland in 1995. Initially data were collected from 56 paediatric and adult CF clinics, using standardised forms, and validated through a system of double data entry, range checking and error correction 3. In 2005 data collection moved from a paper-based return system to the online PortCF software used in the US registry. During this transfer there was extensive retrospective data cleaning and checking, undertaken by independent contractors. The UK CF Registry and its current software programme, PortCF, is now in its fifth year, with four annual reports produced 4. Data are collected in over 200 fields. A complete data set was recorded for 82% of patients in 2009, and this proportion has increased year on year 5. The coverage for core variables used in this analysis, such as weight and %FEV1, is higher, and almost all of the people fulfilling the study inclusion criteria had data in these fields (figure E1).

Entry criteria
For pragmatic reasons, the analysis did not include data for people aged ≥40 years. Only a small proportion of the data relate to this group: 5% of the annual reviews occurred in patients aged ≥40 years. Including these data would extend the age range for the analysis up to 78 years of age. We chose to apply an upper age limit to the analysis since we have previously shown that random intercept and slope models make unrealistic assumptions when applied over long periods 6.

Primary outcome and covariates
Pulmonary function tests were performed according to international recommendations 7 , measuring forced expiratory volume in one second, expressed as a percentage of predicted values for sex and height using reference equations from Wang or Hankinson 8,9 . Supplemental nutritional support included patients receiving supplements orally, by nasogastric tube, gastrostomy tube, jejunal tube or total parenteral nutrition (TPN). Any inhaled antibiotic therapy included Tobramycin solution for inhalation, other inhaled aminoglycoside, Colistin and Promixin.

Deprivation scores as a measure of small area SES
The indices of multiple deprivation in the UK are widely used as measures of SES in epidemiological studies [10][11][12] and are recommended for tracking health inequalities in UK government statistics 13 . Indices of multiple deprivation combine economic, social and housing indicators measured at the census into a composite deprivation score for small areas in the UK constituent countries 14 . There were 41500 of these small areas in the UK, containing on average 1400 people (range 500-3700). All of these small areas were ranked on the basis of the continuous deprivation score, and then divided into fifths, or "quintiles", providing the following approximate cut-off points for normative deprivation quintiles: <8.31; 8.32 to 13.81; 13.82 to 21.20; 21.21 to 34.11, >34.11. The IMD methodology allows much finer resolution than analyses using ZIP codes in the USA, which contain on average 30 000 people 15 . We used the postcode first recorded at entry to the dataset to link an individual to an IMD score, in order to generate a fixed measure of SES.
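The quintile grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `imd_quintile` is hypothetical, the cut-off points are the approximate values quoted in the text, and a score falling exactly on a boundary (for example 8.31, which the quoted ranges leave ambiguous) is assigned to the lower quintile.

```python
from bisect import bisect_left

# Approximate upper bounds of quintiles 1 to 4, copied from the cut-off
# points quoted in the text. Scores above the last bound fall in quintile 5.
QUINTILE_UPPER_BOUNDS = [8.31, 13.81, 21.20, 34.11]


def imd_quintile(score: float) -> int:
    """Map an IMD score to a quintile: 1 = least deprived, 5 = most deprived.

    Assumption (not stated in the source): a score exactly equal to a
    boundary value is placed in the lower of the two adjacent quintiles.
    """
    return bisect_left(QUINTILE_UPPER_BOUNDS, score) + 1


# Hypothetical scores, for illustration only.
for example_score in (5.0, 13.82, 40.0):
    print(example_score, imd_quintile(example_score))
```

Because each person is linked to the score for the postcode first recorded at entry, a mapping of this kind would be applied once per individual rather than once per annual review.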

Statistical Methods
Statistical analysis was undertaken using R (version 2.9.2 for Mac), and the lme4, survival, Hmisc, memisc, mgcv and ggplot2 packages.
Kaplan-Meier estimates and Cox regression were used to assess the effect of deprivation on time to diagnosis. For the analysis of continuous outcomes (e.g. weight, height, BMI, %FEV1, IV days), we first visualised the data using spaghetti plots of individuals' measurement sequences together with nonparametric smoothed means, in order to determine the provisional model mean trajectory (see figure E10). In order to establish the shape of the time trends we plotted the unadjusted population average trend, and examined generalised additive models (GAMs). We then approximated these trends using linear functions (e.g. %FEV1 in the younger age group), piecewise or broken-stick functions (weight, BMI), or quadratics (e.g. any IV therapy) as appropriate. This is illustrated in figure E5 below.
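A broken-stick (piecewise linear) trend can be written as an ordinary linear model by adding a "hinge" term that is zero before the knot and grows linearly after it. The Python sketch below illustrates the idea only. The knot at age three follows the description of the weight and BMI analysis in this supplement, but the function names and the coefficient values are hypothetical and are not the fitted estimates.

```python
def broken_stick_basis(age: float, knot: float = 3.0) -> tuple[float, float]:
    """Return the two regressors of a broken-stick trend: age and the hinge.

    The hinge term max(age - knot, 0) is zero up to the knot, so its
    coefficient measures the change in slope after the knot.
    """
    return age, max(age - knot, 0.0)


def broken_stick_mean(age: float, intercept: float, slope: float,
                      slope_change: float, knot: float = 3.0) -> float:
    """Mean trajectory: one straight line before the knot, another after it."""
    base, hinge = broken_stick_basis(age, knot)
    return intercept + slope * base + slope_change * hinge


# Hypothetical coefficients: a rise up to age three and a decline afterwards.
for example_age in (1.0, 3.0, 5.0):
    print(example_age, broken_stick_mean(example_age, -0.5, 0.1, -0.15))
```

With this parameterisation the slope is `slope` before the knot and `slope + slope_change` after it, and the two segments meet at the knot, so the fitted trajectory stays continuous.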
Repeated measures on individuals are correlated, and this must be accommodated to obtain valid inferences. To analyse the continuous-valued outcomes (weight, height and %FEV1) we used a linear mixed model 16. Specifically, denoting by $Y_{ij}$ the $j$th repeated measurement on the $i$th individual and $t_{ij}$ the age at the time of measurement, we assumed that
$$Y_{ij} = \mu_{ij} + U_i + V_i t_{ij} + Z_{ij},$$
where the $\mu_{ij}$ are the expectations of the $Y_{ij}$ and are described by a multiple linear regression model, the $(U_i, V_i)$ pairs are subject-specific intercepts and slopes, modelled as bivariate Normally distributed random variables independently realised for different subjects, with means zero, variances $u^2$ and $v^2$ and correlation $\rho$, and the $Z_{ij}$ are residuals modelled as mutually independent, Normally distributed random variables with mean zero and variance $\tau^2$. This special case of the linear mixed model implies that the variance of the $Y_{ij}$ changes with age, $t$, as the quadratic function
$$\mathrm{Var}(Y_{ij}) = u^2 + 2\rho u v t + v^2 t^2 + \tau^2.$$
To analyse the binary outcomes (PA status, use of therapies in past year), we used a generalized linear mixed model. This specifies a logistic regression model for the effects of covariates on the probability of, for example, pseudomonas acquisition, but adjusts the standard errors of the regression parameters to take account of the correlation structure of the repeated measurements in the same way as described above for the linear mixed model.
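Under the random intercept and slope model described above, the variance of a measurement at age t works out as u^2 + 2*rho*u*v*t + v^2*t^2 + tau^2. The short Python sketch below checks that expression against the equivalent matrix form z'Gz + tau^2, where z = (1, t) and G is the covariance matrix of the random intercept and slope. The function names and parameter values are hypothetical and chosen only for illustration.

```python
def marginal_variance(t: float, u: float, v: float, rho: float, tau: float) -> float:
    """Closed-form variance of Y at age t under random intercept and slope."""
    return u**2 + 2.0 * rho * u * v * t + (v * t) ** 2 + tau**2


def marginal_variance_matrix(t: float, u: float, v: float, rho: float, tau: float) -> float:
    """Same quantity computed as z' G z + tau^2 with z = (1, t)."""
    g = [[u * u, rho * u * v],
         [rho * u * v, v * v]]  # covariance of (intercept, slope)
    z = [1.0, t]
    quad = sum(z[a] * g[a][b] * z[b] for a in range(2) for b in range(2))
    return quad + tau**2


# Hypothetical parameter values; the two forms agree at every age.
for example_age in (0.0, 5.0, 20.0, 39.0):
    closed = marginal_variance(example_age, 1.0, 0.05, -0.3, 0.5)
    matrix = marginal_variance_matrix(example_age, 1.0, 0.05, -0.3, 0.5)
    print(example_age, closed, matrix)
```

The t^2 term dominates at large t, which illustrates why a model of this form can imply implausibly large variances when applied over a long age range.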
We first examined univariate associations between covariates and the population mean outcomes over time, then developed a multivariate model and assessed the need for interactions. We also explored treating deprivation as a continuous term or as a factor. Although IMD is measured on a continuous scale, for descriptive summaries we have followed the common practice of grouping IMD into quintiles. However, reducing IMD to a categorical variable loses information, and also leads to models that are difficult to interpret, especially when this five-level categorical variable interacts with nonlinear time effects. Where we retained IMD as a continuous variable, the fitted beta coefficients for IMD score were then used to summarise the effect of deprivation by comparing a person in the midpoint of the most deprived quintile to one in the mid-point of the least deprived quintile.
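The table notes in this supplement state that the deprivation coefficient is multiplied by 58 to obtain the contrast between the midpoints of the least and most deprived quintiles. A minimal Python sketch of that scaling follows. The function names, the coefficient values and the use of a Wald-type 95% interval (estimate plus or minus 1.96 standard errors) are assumptions made for illustration, not details taken from the analysis.

```python
import math

# Gap in IMD score between the midpoints of the least and most deprived
# quintiles, as stated in the table notes of this supplement.
MIDPOINT_GAP = 58.0


def quintile_contrast(beta: float, se: float, gap: float = MIDPOINT_GAP,
                      z: float = 1.96) -> tuple[float, float, float]:
    """Scale a per-unit IMD coefficient to the extreme-quintile contrast.

    Returns (estimate, lower, upper). The interval is a Wald-type interval,
    which is an assumption of this sketch.
    """
    estimate = beta * gap
    half_width = z * se * gap
    return estimate, estimate - half_width, estimate + half_width


def odds_ratio_contrast(beta: float, se: float, gap: float = MIDPOINT_GAP,
                        z: float = 1.96) -> tuple[float, float, float]:
    """Same contrast for a logistic model, exponentiated to the odds ratio scale."""
    estimate, lower, upper = quintile_contrast(beta, se, gap, z)
    return math.exp(estimate), math.exp(lower), math.exp(upper)


# Hypothetical coefficients, not results from the study.
print(quintile_contrast(beta=-0.005, se=0.001))
print(odds_ratio_contrast(beta=0.01, se=0.002))
```

Scaling the coefficient and its standard error by the same constant leaves the test statistic unchanged, so the contrast only re-expresses the continuous effect on a more interpretable scale.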

Results
Population characteristics

Figure E1: Flowchart showing people included in the primary weight analysis. After applying eligibility criteria there was very little missing data in the final complete case analysis. An age-based cut-off is used to stratify the analysis, and people with data straddling 18 years of age can thus contribute to both analyses.

Table notes (table body not recovered): * p < 0.05, ** p < 0.01, *** p < 0.001. Standard errors in parentheses; birth year coefficients not shown. The deprivation effect is multiplied by 58 to generate the contrast between the midpoints of the least and most deprived quintiles. age2 is the coefficient for the split line at age three in the weight and BMI analysis.

Figure E5: Piecewise modelling approach to weight z score trajectory fitted by OLS compared to smoothed mean. The smoothed mean weight z score increases to around age three and decreases subsequently. This was modelled as a piecewise regression, with a "knot" at age three. A similar approach was taken to modelling BMI z score.

Figure E6: Modelled growth trajectories for children, comparing least (blue) and most deprived quintiles (red).
These plots illustrate the contrast between deprivation quintiles. The trajectories are plotted at the reference values for other covariates in the final regression models: female sex, homozygote delta F508 carrier, not diagnosed by screening, white, born in 1991. Weight SD scores increased from the time of diagnosis to around age three, and then decreased. This is modelled as a split straight line with a knot at age three.

In a supplementary analysis we tested for an SES and screening interaction, and although the point estimate was in the direction that supports a narrowing of inequality with screening, it was not significant.

Figure E11: Generalised additive models (GAMs) showing the shape of the relationship between %FEV1 (upper panel), risk of any IV therapy (lower panel), and deprivation score. %FEV1 decreases with increasing deprivation, and there is a dose-response relationship. Risk of any IV therapy increases with increasing deprivation, also in a graded fashion.

Changing deprivation scores
Over the study period 18% of eligible individuals had more than one postcode recorded. As a robustness check, we repeated the analysis for %FEV1, treating SES as a time-varying covariate, but this did not materially alter the result (table E5).

Adjustment for clustering by CF centre
Differences between centres may mediate some of the effects of socioeconomic status on outcomes, and explain some of the differences in treatments received. In order to explore this we replicated the final models for %FEV1, and for any IV therapy, adding in care centre as a fixed effect. This made no difference to the deprivation effect (table E5).

Excluding data pre-2000
Excluding the data pre-2000, when recruitment to the cohort was increasing over time, made no difference to the deprivation effect (