Analysis of electronic health record data of hepatitis B virus (HBV) patients in primary care: hepatocellular carcinoma (HCC) risk associated with socioeconomic deprivation

Objectives We set out to characterise chronic hepatitis B (CHB) in the primary care population in England and investigate risk factors for progression to hepatocellular carcinoma (HCC).
Study design Retrospective cohort study.
Methods We identified 8039 individuals aged ≥18 years with a record of CHB between 1999 and 2019 in the English primary care database QResearch. HCC risk factors were investigated using Cox proportional hazards modelling.
Results Most of those with a record of CHB were males (60%) of non-White ethnicity (>70%), and a high proportion were in the most deprived Townsend deprivation quintile (44%). Among 7029 individuals with longitudinal data, 161 HCC cases occurred. Increased HCC hazards were significantly associated with male sex (adjusted hazards ratio (aHR) 3.17, 95% Confidence Interval (95CI) 1.92-5.23), older age (for age groups 56-65 and ≥66 years, compared to 26-35 years, aHRs 2.82 (95CI 1.45-5.46) and 3.76 (95CI 1.79-7.9), respectively), Caribbean ethnicity (aHR 3.32, 95CI 1.43-7.71, compared to White ethnicity), ascites (aHR 3.15, 95CI 1.30-7.67), cirrhosis (aHR 6.55, 95CI 4.57-9.38) and peptic ulcer disease (aHR 2.26, 95CI 1.45-3.51).
Conclusions Targeting interventions and HCC surveillance at vulnerable groups is essential to improve CHB outcomes, and to support progress towards international goals for the elimination of hepatitis infection as a public health threat.


Introductory Statement
The 2022 Global Burden of Disease analysis of hepatitis B virus (HBV) epidemiology estimates that >300 million people live with chronic infection worldwide (1). Through its progression to cirrhosis and primary liver cancer (hepatocellular carcinoma (HCC)), chronic hepatitis B (CHB) is the leading global cause of HCC death (2), and the third leading cause of death amongst people with cirrhosis (1). Over recent decades, age-standardised death rates have remained constant or increased for HCC and cirrhosis, respectively, and HBV-attributable deaths have increased worldwide (3). International targets calling for the elimination of HBV infection as a public health threat by the year 2030 have been set (4), with recent recognition and investment into the early detection and treatment of HCC (5). Meeting elimination targets requires clear understanding of the epidemiology of infection and associated liver disease in order to target resources and interventions to high-risk groups and benchmark progress. CHB prevalence has not been robustly estimated in many settings, including the UK (6), and groups at the highest risk of morbidity and mortality have not been well characterised. Furthermore, even in well-defined CHB populations, treatment coverage and eligibility are often unreported. Regional HBV reports from UK public health services (UK Health Security Agency, previously Public Health England) have included neither overall estimates of the proportion of CHB individuals receiving antiviral treatment, nor estimates stratified by relevant subgroups other than age, sex and ethnicity (6-10).
There has been increasing interest in identifying risk factors for CHB progression to cirrhosis, HCC and other endpoints (11). Age, sex, HBV DNA viral load (VL) and viral genotype are established determinants of HCC risk (12-18), and recent studies have reported associations between HCC and various comorbidities, including type 2 diabetes mellitus (T2DM) and hypertension (11). However, few cohorts have been characterised in European countries and/or in ethnically diverse populations, to validate or inform scoring approaches.
Studies based on electronic health records (EHRs) enable characterisation of large retrospective cohorts, thus enhancing statistical power, and identifying a study sample that is more representative of the whole disease population compared to clinical trials. Such databases often have longitudinal follow-up, with exposures and outcomes ascertained over time. EHR databases can often be linked to other registries (such as national cancer registries and vital statistics), allowing for identification of relevant endpoints.
Given the substantial evidence gaps concerning HBV epidemiology, disease burden and risk factors for progression to HCC, we set out to identify a cohort from a large-scale primary care EHR database in England (19) with two aims: (i) to characterise the CHB population and (ii) to investigate risk factors for progression to HCC.

Data source and study population/design
We used data from the English primary care database QResearch (version 45), which contains >35 million patient records from >1800 individual practices (20). QResearch was established in 2002 and contains anonymised individual-level patient EHRs. Data are collected prospectively and are linked to hospital episode statistics (HES), National Cancer Registration Analysis Service (NCRAS) and Office for National Statistics (ONS) mortality data.
We identified individuals from the QResearch database who (at any time between 01 January 1999 and 31 December 2019) were aged ≥18 years and had a record of CHB based on either: (i) a diagnostic Systematized Nomenclature of Medicine (SNOMED)/Read or International Classification of Disease (ICD) code indicating CHB (21,22), or (ii) presence of detectable hepatitis B surface antigen (HBsAg) or HBV DNA (VL) measurement on at least two recordings ≥6 months apart (Supplementary Figure 1).
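The two-part case definition above can be sketched as a simple rule. This is an illustrative Python sketch only, not the extraction code used for the study (which is not given in the text); the function names are hypothetical and "6 months" is approximated as 183 days, which is an assumption.

```python
from datetime import date

MIN_GAP_DAYS = 183  # assumption: "6 months" approximated as 183 days


def meets_lab_definition(positive_dates):
    """Return True if any two positive HBsAg/VL records are >= MIN_GAP_DAYS apart."""
    if len(positive_dates) < 2:
        return False
    # The widest gap is always between the earliest and the latest record.
    return (max(positive_dates) - min(positive_dates)).days >= MIN_GAP_DAYS


def has_chb_record(has_diagnostic_code, positive_dates):
    """Criterion (i): diagnostic code; criterion (ii): repeated positive tests."""
    return has_diagnostic_code or meets_lab_definition(positive_dates)
```

For example, a patient with no diagnostic code but positive tests on 1 January and 1 August of the same year would satisfy criterion (ii) under these assumptions.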

Covariate selection and ascertainment
We identified relevant covariates for extraction a priori (protocol submitted to QResearch) based on previous literature (11,23-30) and clinical relevance. Ethnicity is categorised in QResearch as per 2011 census categories (31). Baseline patient-level Townsend Deprivation quintile is available as a measure of socioeconomic status in QResearch electronic records, and is a multifactorial measure of deprivation which accounts for employment, home and car ownership and domestic overcrowding.
We characterised lifestyle factors, demographics and relevant numeric biomarkers from relevant SNOMED/Read codes. We collected comorbidity data from relevant SNOMED/Read and ICD-9 and -10 codes and amalgamated subtypes of comorbid cardiovascular diseases (including ischaemic heart disease, hypertension and cerebrovascular disease) into a single variable to improve model fit. Body mass index (BMI, kg/m2) was categorised (underweight, <18.5 kg/m2; healthy weight, 18.5-24.9 kg/m2; overweight, 25.0-29.9 kg/m2; obese, ≥30 kg/m2) based on World Health Organization (WHO) categories (32). Covariate measurements made within ±3 years of earliest CHB diagnosis and before HCC diagnosis were used as proxy baseline measurements. Where patients had >1 measurement taken within 3 years of the earliest CHB diagnosis, we used the measurement taken closest to the diagnosis date.
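The BMI categorisation and the choice of proxy baseline measurement can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names are hypothetical, the WHO cut-offs are the standard ones (18.5, 25.0 and 30.0 kg/m2), and 3 years is approximated as 1095 days.

```python
from datetime import date

WINDOW_DAYS = 3 * 365  # assumption: 3 years approximated as 1095 days


def bmi_category(bmi):
    """Map BMI (kg/m^2) to the WHO categories named in the text."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "healthy weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def baseline_measurement(measurements, chb_date, hcc_date=None):
    """Pick the value recorded closest to CHB diagnosis.

    measurements: list of (date, value) tuples. Only records within the
    window around chb_date and before any HCC diagnosis are eligible.
    """
    eligible = [
        (abs((when - chb_date).days), value)
        for when, value in measurements
        if abs((when - chb_date).days) <= WINDOW_DAYS
        and (hcc_date is None or when < hcc_date)
    ]
    if not eligible:
        return None  # no usable record: the covariate is treated as missing
    return min(eligible, key=lambda pair: pair[0])[1]
```

Returning `None` when no record falls in the window mirrors the way such covariates would then enter the missing-data handling described later.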

Outcome ascertainment
Our primary endpoint of interest was HCC, which we ascertained via identification of patients with relevant SNOMED/Read or ICD codes corresponding to HCC, and by linkage of the cohort to National Cancer Registry data (33,34). In order to maximise outcome ascertainment, we used a broad definition for HCC including multiple relevant codes (Supplementary Table 1). We performed sensitivity analysis (further details below) whereby all patients with non-HCC neoplasms were excluded, to investigate robustness of main analysis using our broad HCC definition. A tabulation of HCC cases across source of diagnosis is presented in Supplementary Table 2.

Follow-up
Earliest date of CHB diagnosis was regarded as cohort entry and initiation of follow-up for each individual. For patients who developed HCC, date of HCC diagnosis was regarded as the end of follow-up. For patients who did not develop HCC (i.e., patients who were censored), follow-up ended at patient cohort exit date (either due to leaving their general practice and switching to a practice which does not contribute to QResearch, or death) or 31 December 2019, whichever occurred earlier. Patients whose database exit date preceded or equalled their first recorded CHB diagnosis date (n = 1010), and who therefore had follow-up time ≤0 years, were excluded from longitudinal analysis.
In some patients, HCC diagnosis date or cohort exit date preceded or was equal to CHB diagnosis date (Supplementary Table 3). Data from these patients were excluded from analyses of HCC risk factors.
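The follow-up rules above amount to a small piece of date arithmetic. The sketch below is illustrative Python, not the study code; the function name and the use of 365.25 days per year are assumptions.

```python
from datetime import date

STUDY_END = date(2019, 12, 31)


def follow_up(chb_date, hcc_date=None, exit_date=None):
    """Return (years_of_follow_up, had_event), or None if follow-up is <= 0.

    Follow-up starts at the earliest CHB diagnosis. It ends at HCC diagnosis
    for cases, otherwise at cohort exit or study end, whichever is earlier.
    """
    if hcc_date is not None:
        end, event = hcc_date, True
    else:
        end = min(exit_date, STUDY_END) if exit_date else STUDY_END
        event = False
    years = (end - chb_date).days / 365.25  # assumption: 365.25 days per year
    if years <= 0:
        return None  # excluded from longitudinal analysis
    return years, event
```

A patient whose exit or HCC date falls on or before the CHB diagnosis date returns `None`, which corresponds to the exclusions described in the text.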

Statistical analysis
Statistical analyses were carried out in R (version 4.1.0). Baseline characteristics were summarised for all CHB patients (regardless of length of follow-up) using descriptive statistics. Means and standard deviations (SDs) or medians and interquartile ranges (IQRs) were presented for continuous measures, and were compared using t or Wilcoxon rank-sum tests, respectively. Patient counts and percentages were presented for categorical and binary variables, and were compared using chi-squared or Fisher's exact tests.
We used univariable and multivariable Cox proportional hazards models to investigate risk factors for progression of CHB to HCC, including variables in the multivariable model based on significance of univariable associations (where P ≤0.1) and/or based on biological/clinical relevance and previous literature (11,23-30). A previous meta-analysis we undertook to investigate risk factors for HCC in CHB was also used to inform variable selection (11). Satisfaction of the proportional hazards assumption was assessed by visualisation of log-log Kaplan Meier survival estimate curves. Where the assumption was violated, time-varying covariates were fitted.
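The log-log check of the proportional hazards assumption can be illustrated with a minimal Kaplan Meier estimator. This is a sketch in plain Python for illustration only (the analysis itself was run in R, and the function names here are hypothetical). Under proportional hazards, the log(-log S(t)) curves for two groups plotted against log(t) should be roughly parallel.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up durations; events: 1 if the outcome occurred, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censorings at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def log_minus_log(curve):
    """Transform S(t) to log(-log S(t)) against log(t), skipping S(t) of 0 or 1."""
    return [
        (math.log(t), math.log(-math.log(s)))
        for t, s in curve
        if 0.0 < s < 1.0 and t > 0
    ]
```

In practice the transformed curve would be computed separately for each level of a covariate and the curves compared visually.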

Europe PMC Funders Author Manuscripts
Continuous laboratory parameters which were right-skewed were transformed with a natural logarithm for inclusion in multivariable models. Laboratory parameters were divided into quintiles for inclusion in multivariable models. Means and SDs for log aspartate transaminase (AST), log alanine transaminase (ALT) and platelet count (Plt) quintiles are presented in the supplement (Supplementary Table 4). Hazard ratios and 95% confidence intervals (95% CI) were reported for Cox proportional hazards model outputs. Analysis on the imputed dataset was used for main models.
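The log transformation and quintile grouping of laboratory parameters can be sketched as below. This is an illustrative Python sketch using only the standard library; the function names are hypothetical, and the cut-point method (the default of `statistics.quantiles`) is an assumption, since the text does not state how quintile boundaries were computed.

```python
import math
import statistics


def log_transform(values):
    """Natural-log transform for right-skewed, strictly positive measurements."""
    return [math.log(v) for v in values]


def quintile_labels(values):
    """Assign each value to a quintile (1 = lowest, 5 = highest)."""
    cuts = statistics.quantiles(values, n=5)  # four cut points from the data
    return [1 + sum(1 for c in cuts if v > c) for v in values]
```

Because the logarithm is monotonic, assigning quintiles before or after the transformation gives the same grouping of patients.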

Handling of missing data
Values were missing for Townsend Deprivation Quintile, ethnicity, alcohol consumption, cigarette consumption, BMI, Plt, ALT measurement, aspartate transaminase (AST) measurement, hepatitis B surface antigen (HBsAg), and HBV viral load (VL). Missing data are described further in Supplementary Table 5.
Multiple imputation by chained equations (MICE) was used to impute missing data across patient characteristics. The assumption of missing at random was made for imputed variables. This is in accordance with previous handling of missing data in cohorts utilising QResearch data (35-38), and current recommendations for imputation of missing data (39). Characteristics with >90% missingness were not imputed. Ten imputed datasets were generated, and results from univariable and multivariable Cox proportional hazards models from each dataset were pooled according to Rubin's rules (40,41).
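Pooling by Rubin's rules combines the per-dataset estimates into one estimate whose variance reflects both within-imputation and between-imputation uncertainty. The sketch below is illustrative Python, not the study code; the function name is hypothetical, and the inputs are assumed to be log hazard ratios and their squared standard errors from each imputed dataset.

```python
import math


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances; return (estimate, SE)."""
    m = len(estimates)
    pooled = sum(estimates) / m                      # mean of the m estimates
    within = sum(variances) / m                      # average within-imputation variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between           # Rubin's total variance
    return pooled, math.sqrt(total)
```

The pooled log hazard ratio would be exponentiated to report a hazard ratio. Confidence intervals under Rubin's rules use a t reference distribution with adjusted degrees of freedom, which this sketch does not compute.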

Sensitivity analyses
To test robustness of our main model, we performed three sensitivity analyses (Supplementary Table 6), as follows: (i) main results model fit to complete-case cohort subset (i.e. the subset of patients with completeness for all variables); (ii) exclusion of patients with history of non-HCC neoplasms (presented in Supplementary Table 7) to control for unmeasured outcome misclassification whereby secondary liver cancer has been misclassified as primary HCC; (iii) addition of ALT, AST and Plt in the main model fit to the imputed dataset, as the percentage of missingness in these exposures was too high for them to be included in main analysis.
In order to further investigate the association of antiviral therapy with HCC risk, propensity score analysis was undertaken as an additional sensitivity analysis. Specifically, a propensity score for initiating antiviral treatment was generated by regressing the odds of treatment initiation onto the following predictors of treatment initiation: age, sex, socioeconomic status, ethnicity, BMI, T2DM, alcohol-related liver disease, cirrhosis, end-stage liver disease and non-alcoholic fatty liver disease. Accordingly, two models were fitted to investigate how the association of antiviral therapy with HCC risk changed before and after the addition of the propensity score to the model, in order to determine whether the association of antiviral therapy initiation is confounded by factors associated with treatment initiation.


Results
Individuals with CHB are under-represented in primary care records, with the majority of patients with CHB being male, of non-white ethnicity, socioeconomically deprived and untreated
We identified 8039 individuals living with CHB in the QResearch database between 1999 and 2019 from a database-wide denominator of ~35 million individuals, translating to a prevalence of diagnosed CHB of 0.023%. Most of these were identified by a SNOMED/Read or ICD CHB diagnostic code (7856/8039, 97.7%), with a remaining 2.3% (252/8039) identified by HBsAg and/or VL measurements (in the absence of a diagnostic code) (Table 1). Median follow-up duration was 3.87 years (IQR 6.30 years), with differential follow-up between individuals who developed HCC (median follow-up 1.47 years, IQR 5.13 years) and those who did not (median follow-up duration 3.93 years, IQR 6.28 years). Mean age at baseline was 38.3 years (SD 11.6 years), and at baseline >75% were ≤45 years of age. The majority were male (4856/8039, 60.4%).
Black African ethnicity represented 25.4% of individuals, with 12.9% of Chinese ethnicity, 5.9% of Pakistani ethnicity, 3.0% of Indian ethnicity and 28.4% of White ethnicity (Table 1). Proportions of Black and ethnic minorities in our CHB cohort were greater than those in both the wider QResearch database (42) and general English population (43) (Figures 1,2). Data for cigarette consumption, alcohol consumption and BMI were available for 71.5%, 56.1% and 61.3% of the cohort, respectively (Table 1). At baseline, most individuals (88.0%) had no record of antiviral treatment. Within 1, 1-2, 2-3 and ≥4 years of CHB diagnosis, cumulatively 2.7%, 4.0%, 5.0% and 10.1% of patients had a record of antiviral treatment initiation from baseline, respectively.
Age and sex were significantly associated with Townsend deprivation quintile, with more deprived quintiles characterised by younger mean age (P < 0.001) and higher proportions of males (P = 0.021) (Table 2). Ethnicity differed across quintiles (P < 0.001) whereby the proportions of Bangladeshi and Black African ethnicity patients increased with increasing deprivation, and proportions of White and Chinese patients decreased. Alcohol and cigarette consumption were also associated with deprivation quintile (P < 0.001 and P = 0.04, respectively), but no obvious trends were apparent across quintiles. No associations of antiviral treatment, antidiabetic drug, antihypertensive, NSAID and statin use with deprivation quintile were observed.

Prevalence of diabetes and hypertension in adults with CHB was higher than in the general population
Baseline prevalences of T2DM and hypertension were 8.9% and 15.3%, respectively, differing from prevalences of <8% and <3%, respectively, that have been reported in the wider QResearch cohort representing the UK population (44). Prevalence of other comorbidities (including congestive heart failure, chronic kidney disease, alcohol-related liver disease, ascites, autoimmune hepatitis, cerebrovascular disease, end-stage liver disease, ischaemic heart disease, non-alcoholic fatty liver disease and peptic ulcer disease; Table 1) ranged from 0.1% to 5%. A minority (8.6%) of the cohort had a diagnostic code indicating cirrhosis. Frequency of medication use was as follows: antidiabetic drugs (9.2%), antihypertensives (5.1%), non-steroidal anti-inflammatory drugs (NSAIDs) (5.9%) and statins (5.7%). Prevalence of non-HCC neoplasm was 4.7% in the overall cohort.

Risk factors for HCC included male sex, older age, increased deprivation, Caribbean ethnicity and peptic ulcer disease
Baseline characteristics of the imputed dataset used in analysis of HCC risk factors are presented in Supplementary Table 8. Multivariable Cox proportional hazards models were constructed for 7029 patients in whom 161 HCC cases developed throughout 41,147 person-years of follow-up (Figure 3, Table 3). This translated to an HCC incidence rate of 5.10 cases per 1000 person-years (95% Confidence Interval (95CI) 4.46 to 5.84).
Hazards of HCC were increased in males (adjusted hazards ratio (aHR) 3.17, 95% CI 1.92-5.23). No medicines, including antiviral treatment, were associated with hazards of HCC. However, it is important to note that statin use was associated with reduced hazards of HCC, although confidence intervals for this association crossed the null. Kaplan Meier curves for the associations of ascites, cirrhosis and peptic ulcer disease are available in Supplementary Figure 2.
Hazards ratios did not change materially in strength or direction upon sensitivity analysis excluding non-HCC neoplasms or including AST, ALT and Plt at baseline (Table 3).

Interrogation of association of antiviral therapy with hazards of HCC
To interrogate the association of antiviral therapy with increased hazards of HCC in our main model results (Table 3), we undertook sensitivity analysis whereby a propensity score for initiating antiviral treatment was generated. We fitted two models to investigate how the association of antiviral therapy with HCC risk changed before and after the addition of the propensity score (Supplementary Table 9). Before addition of the propensity score to the model, antiviral therapy was associated with an increased HCC risk, likely reflecting shared factors associated with both antiviral treatment initiation and increased HCC risk, such as increasing age, male sex and comorbid liver disease. Following addition of the propensity score to the model, this association of antiviral therapy with increased HCC risk was attenuated towards the null and the 95% CI crossed 1.00.

Main model results are robust to complete-case sensitivity analysis
We undertook sensitivity analysis restricted to the subgroup of patients for whom complete data were available (n=3648 patients in whom 68 cases of HCC occurred (Supplementary Table 10)). Hazard ratios did not change materially in strength or direction in sensitivity analysis undertaken to exclude patients with history of non-HCC neoplasms (Supplementary Table 10).

Summary of key findings
This is the largest population of individuals living with HBV characterised in England to date, from either EHR or traditional prospective cohorts. Our CHB group was ethnically diverse, with higher proportions of black and ethnic minority individuals than the total QResearch database (42) or general English population (43). The CHB cohort is disproportionately socioeconomically deprived, with substantial burdens of comorbid disease. We identified increased hazards of HCC associated with increasing age, male sex, socioeconomic deprivation, Caribbean ethnicity, severe liver disease (ascites and cirrhosis), and comorbid disease (peptic ulcer disease). We report a protective association of statin use with HCC risk. Although antiviral treatment is known to moderate HCC risk (45-47), this association was not identified in this dataset. Age, sex and T2DM have previously been found to positively associate with HCC risk in CHB (11); however, this is the first study to confirm these associations in an ethnically diverse cohort. The QResearch database has geographic coverage across England, therefore findings should be generalizable across the country, but thorough representation is precluded as many individuals living with CHB are either undiagnosed or not represented in primary care EHR.

Influence of HBV genotype
HBV genotype is not routinely determined in clinical practice and is therefore not available in EHRs. Viral genotypes associate with ethnicity (48,49), increased HCC risks (50) and antiviral treatment resistance (51). Therefore, associations of HCC risk with ethnicity (and thereby socioeconomic deprivation) may be confounded by genotype or mediated by unmeasured population genetic or lifestyle factors.

Drug treatment and HCC development
A protective association of statin use with HCC hazards has been reported in previous CHB cohorts (11,52,53), in individuals with predisposing HCC risk factors including cirrhosis and T2DM (54,55) and in a general patient population (53-55). However, this association may also be confounded by health-seeking behaviour and/or healthcare engagement whereby unmeasured lifestyle factors or healthcare interventions associate with both statin use and reduced HCC risk. Further analysis, including mediation analysis where data allow, is warranted to investigate potential mechanisms. Similarly, the positive association of peptic ulcer disease with hazards of HCC may be confounded by proton pump inhibitor (PPI) administration for peptic ulcer treatment. PPI prescription/usage was not available in our sample, but previous observational studies report increased risks of HCC with PPIs (56,57). Pooled risk estimates from meta-analyses are variable (58-60). It is possible that this association can be attributed to ascertainment bias and is confounded by cirrhosis, whereby cirrhotic patients are more likely to undergo surveillance endoscopy and thereby have peptic ulcer disease detected more frequently than non-cirrhotic patients.
Unlike previous observational and randomised interventional studies providing evidence that treatment with nucleoside analogues (NAs) reduces HCC risk (45-47), we do not report this association. Throughout our study, only 10.3% of individuals were documented to have initiated treatment. However, treatment data may be missing from primary care records, as HBV prescribing is based in secondary/tertiary care (61), or a low proportion of our primary care population may be treatment eligible. It is also feasible that our follow-up periods are too short for us to be able to identify a signal for the protective effects of antiviral therapy, which is a long-term intervention.

Limitations of primary care EHR analysis -missing data
We report substantial data missingness, due to poor primary care access/coverage for HBV. The crude prevalence estimate for HBV in this primary care dataset (0.023%) underestimates population prevalence by at least an order of magnitude, as recent estimates suggest a UK prevalence of approximately 0.5% (64). Missingness is likely associated with unmeasured characteristics, therefore marginalised subgroups (including undocumented migrants, highly mobile population subgroups, and people who do not speak English) may be under-represented.
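The size of the gap between diagnosed prevalence in the dataset and estimated population prevalence can be checked with simple arithmetic. The sketch below is illustrative Python; it assumes the database denominator is exactly 35 million (the text gives ~35 million), so the result is approximate.

```python
DIAGNOSED_CHB = 8039
DATABASE_DENOMINATOR = 35_000_000  # assumption: "~35 million" taken as exactly 35 million
UK_PREVALENCE_PCT = 0.5            # external estimate cited in the text (64)


def prevalence_pct(cases, population):
    """Crude prevalence as a percentage of the population."""
    return 100.0 * cases / population


diagnosed_pct = prevalence_pct(DIAGNOSED_CHB, DATABASE_DENOMINATOR)
fold_gap = UK_PREVALENCE_PCT / diagnosed_pct
print(f"{diagnosed_pct:.3f}% diagnosed; population estimate is {fold_gap:.0f}-fold higher")
```

Under this assumption the population estimate is roughly 22 times the diagnosed prevalence, consistent with the statement that the dataset underestimates prevalence by at least an order of magnitude.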
Most patients were identified by coding, with a minority (3.0%) having confirmatory laboratory tests accessible in QResearch. This is logical as specialist referral is recommended following HBsAg positivity (65) and second confirmatory tests are performed in secondary/tertiary care. Poor linkage between primary and secondary care health data is currently a missed opportunity for high quality clinical service provision, and for translational health research. Enhanced data linkage between primary and secondary care would provide direct benefits for overall patient management with diverse clinical and public health benefits. For HBV infection specifically, such linkage would improve the quality of national data, assist with screening and prevention interventions, enhance linkage to services and continuity of care, and provide opportunities for improved early diagnosis of liver complications (including HCC).
Our median follow-up is relatively short for a chronic disease, despite a 20-year study period. However, time lags between notification of a patient characteristic/disease and input into electronic systems are common in EHR databases. Additionally, it is likely that individuals living with HBV infection present to primary care late in infection course and/or notification of secondary/tertiary care infection management is not linked to primary care EHR. Differential follow-up between individuals with and without HCC may be due to late HCC presentation with advanced symptomatic liver disease.
Improved characterisation of relevant variables which may influence HCC risk in primary care EHR would improve quality of future analyses. For example, robust identification of cirrhosis would enable stratification according to this disease subgroup, and would allow for interrogation of underlying disease mechanisms for other risk factors. At present, cirrhosis is poorly coded due to heterogeneity of disease phenotype, diverse underlying aetiology, varied approaches to investigation, and lack of robust and universal case definitions. In addition, further research into potential nutritional HCC risk factors, including the effect of underweight status, obesity and alcohol consumption, would provide additional insights into preventive strategies and patient management. This is especially relevant to the HBV population as nutritional habits may associate with socioeconomic characteristics.

Analytical corrections for missing data
Imputation of missing baseline data was undertaken in line with previous QResearch investigations (35-38). We were unable to impute HBV biomarkers (including VL and HBsAg) as >90% of participants lacked a baseline measurement. We therefore excluded these variables from analysis. High missingness was observed for relevant biomarkers (AST, ALT and Plt) indicative of liver health, which are used to score fibrosis and cirrhosis stage, and we therefore could not validate/modify HCC risk scores. Analysis in more complete secondary care datasets is warranted to determine the utility of laboratory parameters as robust predictors of disease endpoints and estimate effect sizes. Missingness of biopsy, imaging, or laboratory scores in primary care EHR (66) limited cirrhosis identification, thereby underestimating the prevalence in the cohort and preventing robust investigation of cirrhosis as an endpoint. Similarly, the prevalence of other comorbidities is likely underestimated, and we are underpowered to detect associations with HCC risk.
Many participants are missing alcohol and cigarette consumption data, and complete measurements may systematically underestimate intake owing to self-reporting bias. Although consumption may associate with HCC risk, we cannot report this association in our study. We were also unable to time-update models for changes in alcohol and cigarette consumption throughout follow-up due to lack of repeated measurements (>90% of individuals had one-off records).

Impact and recommendations
Our results demonstrate that the burden of CHB in the UK is concentrated in a young, ethnically diverse and socioeconomically deprived disease population. Improving access to clinical services, routine HCC surveillance and screening coverage, and representation in large-scale national datasets is necessary. This is warranted not only to improve patient outcomes and reduce the attributable disease burden, but also to obtain more representative data from which more mechanistic and causal inference insights may be gleaned in order to identify specific opportunities for intervention along the patient pathway. Improved linkage of primary and secondary care EHR datasets is essential to achieve these goals.

Conclusions
The CHB population in England is ethnically diverse and socioeconomically deprived. We identified risk factors for HCC, and validated associations observed in previous CHB cohorts. Missingness limits identification of CHB individuals and robust description of those identified. Improved data capture by EHR systems, and enhanced communication between primary and secondary care records, is crucial to provide an evidence base for interventions, including diagnostic screening, treatment and surveillance, modification of risk factors for HCC, and monitoring progress towards elimination targets.

Figure 3. Forest plot for Cox proportional hazards model to identify risks of hepatocellular carcinoma (HCC) in an adult population with chronic hepatitis B virus infection derived from the QResearch primary care database. Analysed using a dataset generated by multiple imputation with chained equations (n = 7029, HCC cases = 161). First (least deprived) to fifth (most deprived) Townsend Deprivation Quintiles are denoted by SES1-5, respectively. 'Init.' refers to treatment initiation with antiviral therapy.
HCC risk stratification
HCC risk scores (including PAGE-B (24), REACH-B (25,62), GAG-HCC (26,27) and CU-HCC (28-30)) incorporate various characteristics, including age, sex, and laboratory parameters to predict HCC risk. The utility of existing risk scores in homogeneous patient subgroups has been demonstrated (63). Future analyses are required to validate (and potentially modify) scores in heterogeneous, ethnically and clinically diverse samples to inform interventions.
Financial support
PCM is funded by Wellcome (grant ref 110110/Z/15/Z), UCLH NIHR Biomedical Research Centre and the Francis Crick Institute. CC's doctoral project is jointly funded by the Nuffield Department of Medicine, University of Oxford and by GlaxoSmithKline. EB is funded by the Oxford NIHR Biomedical Research Centre and is an NIHR Senior Investigator. The views expressed in this article are those of the author and not necessarily those of the NHS or the NIHR. PCM, EB and CC are supported by the DeLIVER programme "The Early Detection of Hepatocellular Liver Cancer", which is funded by Cancer Research UK (Early Detection Programme Award, grant reference: C30358/A29725). EB, TW, CC and PCM acknowledge support from the NIHR Health Informatics Collaborative.

Figure 1. Townsend deprivation quintile breakdown in 8039 adults with chronic hepatitis B virus infection derived from the QResearch primary care database (England) versus the United Kingdom general population.

Figure 2. Ethnicity breakdown in 8039 adults in a chronic hepatitis B virus cohort characterised from the QResearch primary care database (England) versus all individuals in the QResearch database (~35 million), vs. the United Kingdom general population. General population estimates obtained from 2019 estimates from the Office for National Statistics. CHB-QR, individuals with chronic hepatitis B in the QResearch primary care database.