Trends and levels of the global, regional, and national burden of appendicitis between 1990 and 2021: findings from the Global Burden of Disease Study 2021

Summary Background Appendicitis is a common surgical emergency that poses a large clinical and economic burden. Understanding the global burden of appendicitis is crucial for evaluating unmet needs and implementing and scaling up intervention services to reduce adverse health outcomes. This study aims to provide a comprehensive assessment of the global, regional, and national burden of appendicitis, by age and sex, from 1990 to 2021. Methods Vital registration and verbal autopsy data, the Cause of Death Ensemble model (CODEm), and demographic estimates from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) were used to estimate cause-specific mortality rates (CSMRs) for appendicitis. Incidence data were extracted from insurance claims and inpatient discharge sources and analysed with disease modelling meta-regression, version 2.1 (DisMod-MR 2.1). Years of life lost (YLLs) were estimated by combining death counts with standard life expectancy at the age of death. Years lived with disability (YLDs) were estimated by multiplying incidence estimates by an average disease duration of 2 weeks and a disability weight for abdominal pain. YLLs and YLDs were summed to estimate disability-adjusted life-years (DALYs). Findings In 2021, the global age-standardised mortality rate of appendicitis was 0·358 (95% uncertainty interval [UI] 0·311–0·414) per 100 000. Mortality rates ranged from 1·01 (0·895–1·13) per 100 000 in central Latin America to 0·054 (0·0464–0·0617) per 100 000 in high-income Asia Pacific. The global age-standardised incidence rate of appendicitis in 2021 was 214 (174–274) per 100 000, corresponding to 17 million (13·8–21·6) new cases. The incidence rate was the highest in high-income Asia Pacific, at 364 (286–475) per 100 000 and the lowest in western sub-Saharan Africa, at 81·4 (63·9–109) per 100 000. 
The global age-standardised rates of mortality, incidence, YLLs, YLDs, and DALYs due to appendicitis decreased steadily between 1990 and 2021, with the largest reduction in mortality and YLL rates. The global annualised rate of decline in the DALY rate was greatest in children younger than the age of 10 years. Although mortality rates due to appendicitis decreased in all regions, there were large regional variations in the temporal trend in incidence. Although the global age-standardised incidence rate of appendicitis has steadily decreased between 1990 and 2021, almost half of GBD regions saw an increase of greater than 10% in their age-standardised incidence rates. Interpretation Slow but promising progress has been observed in reducing the overall burden of appendicitis in all regions. However, there are important geographical variations in appendicitis incidence and mortality, and the relationship between these measures suggests that many people still do not have access to quality health care. As the incidence of appendicitis is rising in many parts of the world, countries should prepare their health-care infrastructure for timely, high-quality diagnosis and treatment. Given the risk that improved diagnosis may counterintuitively drive apparent rising trends in incidence, these efforts should be coupled with improved data collection, which will also be crucial for understanding trends and developing targeted interventions. Funding Bill and Melinda Gates Foundation.

Appendix 1: Supplementary methods and results to "The global, regional, and national burden of appendicitis between 1990 and 2021: a systematic analysis for the Global Burden of Disease Study 2021"

Section 1. Statement of GATHER compliance
This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) recommendations. See table S1 below for the GATHER checklist. The GATHER recommendations can be found on the GATHER website (https://www.who.int/data/gather).
Provide all data inputs in a file format from which data can be efficiently extracted (e.g., a spreadsheet rather than a PDF), including all relevant metadata listed in item 5. For any data inputs that cannot be shared because of ethical or legal reasons, such as third-party ownership, provide a contact name or the name of the institution that retains the right to the data.
Per GBD standards, age-standardisation was performed using the direct method, as described by Ahmad and colleagues for WHO in 2001. 1 That is to say, we aggregated year-age-sex-location-specific estimates to the age-sex distribution of a reference population, specifically, the GBD world population age standard. This world population age standard uses the non-weighted mean of GBD 2021's age-specific population proportional distributions for all national locations with a population greater than 5 million people in 2019 (non-pandemic year). The population estimates used for GBD 2021 will be reported in GBD 2021 Demographics Collaborators, Global age-sex-specific mortality, life expectancy, and population estimates in 204 countries and 811 subnational locations, 1950-2021: a comprehensive demographic analysis for the Global Burden of Disease Study 2021, Lancet (accepted), 2 but methods are broadly similar to GBD 2019. 3

Section 4. Fatal estimation

Flowchart
The flowchart below describes the mortality and YLL estimation processes used for appendicitis in the Global Burden of Disease (GBD) study.

Data identification and processing
We used vital registration and verbal autopsy data to model appendicitis mortality. Specifically, causes of death (CoD) data were extracted from 870 sources covering 134 countries and territories, as shown in the coverage map in Figure S2. The International Classification of Diseases (ICD) codes mapped to appendicitis are shown in Table S3.
A detailed description of processing of CoD data for GBD has been published previously. 4,5 The processing follows the standard steps that are applicable to all GBD-defined causes, including ICD mapping, disaggregation of aggregated data, age-sex splitting, corrections for misclassification, redistribution of garbage codes, and noise reduction. Outliers were identified by systematic examination of datapoints for all location-years. Data were excluded if they violated well-established age or time trends, or if garbage code redistribution and noise reduction, in combination with small sample sizes, resulted in unreasonable cause fractions. Methods for assigning outlier status were consistent across both vital registration and verbal autopsy data.

Modelling
To estimate appendicitis mortality, a standard causes of death ensemble model (CODEm) with location-level covariates was used. 4,6 CODEm uses an ensemble modelling method that involves generating and validating submodels using the train, test 1, test 2 approach, then weighting and testing model performance to select the ensemble model with the highest out-of-sample predictive validity. It was originally described in a peer-reviewed publication by Foreman et al. 6 The updates to the method have been described in a series of publications reporting GBD cause of death results. 4,7-10 As described in greater detail in Foreman et al, CODEm modelling starts with generation of a diverse set of potential submodels (i.e., component models) in four families: linear mixed effects regression (LMER) models of the natural log of the cause-specific death rate, LMER models of the logit of the cause fraction, spatiotemporal Gaussian process regression (ST-GPR) models of the natural logarithm of the cause-specific death rate, and ST-GPR models of the logit of the cause fraction. In any given application, the analyst specifies all plausible covariates associated with a particular cause of death. For each covariate identified, the analyst specifies the expected direction (positive or negative) of its relationship to cause of death based on scientific evidence and classifies it into one of three levels (Levels 1, 2, and 3) based on the plausibility and strength of evidence for causal association. The level of each indicator represents the level of association between the covariate and appendicitis mortality. It ranges from 1, indicating a strong biological link to outcome, to 3, indicating a weak or unknown relationship to outcome. The direction of each indicator represents the expected direction of relationship to cause of death.
For appendicitis, we included six predictive covariates, as shown in Table S4, which were modelled and estimated as part of GBD. Details of each covariate's input data and modelling strategies are described elsewhere. The ST-GPR models begin with a linear prediction, per above, and then smooth over space, time, and age based on weighted residuals of adjacent data and then apply Gaussian process regression.
The process of covariate selection proceeds by level. We test submodels corresponding to all $2^n - 1$ combinations of level 1 covariates, where $n$ is the number of covariates in level 1. All submodels where the coefficients for all covariates have the expected sign and are significant at the p <0.05 level are retained. For each level 1 model that was retained, we create a list of $2^m$ possible level 2 models (where $m$ is the number of level 2 covariates). The first model, which has no level 2 covariates included, has already been tested and is retained. Next, each of the $m$ possible models in which one covariate is added to the level 1 model is tested. If adding the level 2 covariate does not affect either the significance or the sign on any level 1 coefficients, and the level 2 covariate itself meets the criteria of direction and significance, then it is retained as another possible submodel. If the level 2 covariate does not fulfill the criteria or forces any of the level 1 covariates to violate their criteria, then the submodel is dropped; all other possible level 2 submodels that contain that covariate are also dropped. Next, we take each of the models resulting from the level 2 process and use the same process as described for level 2 on the level 3 covariates. Ultimately, we obtain a set of all possible covariate combinations that fulfill our expectations for covariate direction. We run the covariate selection for submodels using both logit cause fraction and natural log of cause-specific death rate and then create both LMER-only and ST-GPR models for each set of chosen covariates.
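The level 1 step of the covariate selection described above can be illustrated with a minimal Python sketch. It is not the CODEm implementation: the covariate names, the expected signs, and the `toy_fit` function (a stand-in for fitting a real regression and returning a coefficient and p value per covariate) are all hypothetical and chosen only to show the enumeration and the sign-and-significance filter.

```python
from itertools import combinations

# Hypothetical level 1 covariates and expected directions (illustration only).
COVARIATES = ["haq", "sdi", "fibre"]
EXPECTED_SIGN = {"haq": -1, "sdi": -1, "fibre": -1}


def nonempty_subsets(covariates):
    """Yield all 2**n - 1 non-empty combinations of the covariates."""
    for size in range(1, len(covariates) + 1):
        yield from combinations(covariates, size)


def passes(fit_result, expected_sign, alpha=0.05):
    """True if every coefficient has the expected sign and p < alpha."""
    return all(
        coef * expected_sign[name] > 0 and p_value < alpha
        for name, (coef, p_value) in fit_result.items()
    )


def select_level1(covariates, expected_sign, fit):
    """Keep the covariate combinations whose fitted submodel passes the filter."""
    return [
        subset
        for subset in nonempty_subsets(covariates)
        if passes(fit(subset), expected_sign)
    ]


def toy_fit(subset):
    """Stand-in for a regression fit; returns {name: (coefficient, p_value)}."""
    result = {}
    for name in subset:
        if name == "haq":
            result[name] = (-0.5, 0.01)
        elif name == "sdi":
            result[name] = (0.2, 0.01)  # wrong sign, so always rejected
        else:
            # "fibre" loses significance when modelled together with "haq"
            result[name] = (-0.1, 0.20 if "haq" in subset else 0.01)
    return result


retained = select_level1(COVARIATES, EXPECTED_SIGN, toy_fit)
```

With three covariates the sketch tests seven combinations; in this toy setup only the two single-covariate submodels with the expected sign and significance are retained. The level 2 and level 3 steps would repeat the same filter on each retained submodel.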
Submodels are fit on 70% of CoD data, with 30% holdouts of the data selected to mimic observed patterns of missingness for age groups, years, and locations. Out-of-sample predictive validity of each submodel is then tested using half of the excluded data (15% of the total). Performance tests include the root-mean-squared-error (RMSE) for the log of the cause-specific death rate, the direction of the predicted versus actual trend in the data, and the coverage of the predicted 95% uncertainty interval. The process of fitting submodels and calculating out-of-sample performance is repeated 20 times. Submodels are ranked according to their performance on these out-of-sample performance tests across the 20 repetitions. Submodels are then weighted to determine their contribution to the ensemble estimate. The relative weights are determined both by the submodel performance ranks and by a parameter ψ, whose value determines how quickly the weights taper off as rank decreases. Specifically, the relationship between weight and parameter ψ is specified using a monotonically decreasing function of submodel rank. The analyst may specify the range of ψ values (usually between 1 and 1.2, as is true for the appendicitis model) to test in order to identify the best ψ value based on the cross-validation.
A set of ensemble models is then created by using the weights constructed from the combinations of ranks and ψ values. These ensembles are tested by using the predictive validity metrics described above on the remaining 15% of the data, and the ensemble with the best performance in out-of-sample trend and RMSE is chosen as the final model.
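As a rough illustration of how rank and ψ interact, the Python sketch below assumes the weight form reported by Foreman et al, in which the weight of a submodel is proportional to ψ raised to the power (N − rank), with N the number of submodels and rank 1 the best performer. The function name and the example ranks are hypothetical, and the sketch is not the CODEm code.

```python
def ensemble_weights(ranks, psi):
    """Normalised submodel weights, assuming weight_i is proportional to
    psi ** (N - rank_i), where rank 1 is the best-performing submodel."""
    n_submodels = len(ranks)
    raw = [psi ** (n_submodels - rank) for rank in ranks]
    total = sum(raw)
    return [value / total for value in raw]


# psi = 1 gives every submodel the same weight; larger psi concentrates
# weight on the top-ranked submodels.
equal = ensemble_weights([1, 2, 3], psi=1.0)
tapered = ensemble_weights([1, 2, 3], psi=1.2)
```

Under this assumed form, testing several ψ values between 1 and 1.2 amounts to comparing ensembles that range from an unweighted average of submodels to one that favours the best-ranked submodels.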
There are several model parameters that the analyst can set in CODEm, including the size of the data hold-outs, the maximum and minimum values for ψ, the linear floor rate (the lowest death rate per 100,000 in the linear predictor), as well as lambda, zeta, omega and GPR parameters. For appendicitis, we used the standard default settings for GBD. 10 The entire ensemble modelling approach is conducted twice for GBD: once to fit a global CODEm model that uses data from, and makes predictions for, all locations in the GBD estimation framework; and once to fit a data-rich CODEm model, which uses data from, and makes predictions for, only the locations considered to be data-rich. The designation of data-rich locations is based on a "star" rating system (0-5 stars) to rate the quality of data for any given location and year, with 5 being the best and 0 being the worst. The inputs that determine this star rating are the percentage of total deaths determined to be garbage coded (such as "All, Ill-defined"), the percentage of deaths determined to be an aggregated cause, and the level of completeness in the dataset. More detailed information about the causes of death data star rating calculation can be found on pages 45-48 of the Supplementary document of the GBD 2019 publication. 4
We assumed that children under 1 year of age do not die from appendicitis; accordingly, the age restrictions for mortality estimation of appendicitis were 12 months for the lower bound and 95+ years for the upper bound. Separate models were run for male and female mortality, in addition to the global and data-rich models mentioned above. We hybridised the separate sex-specific global and data-rich models to acquire unadjusted results, which we adjusted using the cause of death correction (CoDCorrect) procedure 4 and compared to the reference life table to calculate final YLLs due to appendicitis.

Summary statistical metrics on model performance
The table below shows the overall predictive performance of sex-specific global and data-rich models of appendicitis.

Section 5. Non-fatal estimation

Flowchart
The flowchart below describes the incidence, YLD, and DALY estimation processes used for appendicitis in the Global Burden of Disease (GBD) study.

Incidence data
We used medical claims and hospital discharge data to model appendicitis incidence. Medical claims data come from three locations: USA, Poland, and Taiwan (province of China). The USA claims data were extracted from the Truven database of USA private health insurance, which contains more than 12 billion claims for 2000 and 2010-2017.5][16] Hospital inpatient discharge data were extracted from 95 different sources in 50 countries, each covering one to 26 years of data from 1988. Figure S4 shows the data coverage map of these input sources.

Figure S4. Data coverage for estimating incidence of appendicitis
The biggest advantage of the three claims data sources is that claims for all inpatient and outpatient encounters for a single individual can be linked using unique (de-identified) identifiers, regardless of whether appendicitis is coded as the primary or a secondary diagnosis. For appendicitis specifically, to capture cases that were diagnosed and/or treated in both inpatient and outpatient settings, an individual was extracted from claims data as an incident case if that individual had at least one inpatient or outpatient encounter with an appropriate ICD code related to appendicitis (Table S5) as any diagnosis within 28 days. The population sample from which those cases arose was considered to comprise all individuals enrolled in that insurance plan that calendar year.
Hospital discharge data provide observations about encounters, generally with only the primary diagnostic code for the encounter. For these data sources, we applied the inclusion criterion that an admission is an overnight stay of 24 hours or more; discharges with a length of stay (LOS) less than 24 hours are excluded from all inpatient sources wherever possible. This was done for databases known to include "hospital outpatient" and "observation" admissions, because several hospital databases in our dataset are known or suspected not to include these shorter encounters, and we wanted to construct comparable databases. For Truven's MarketScan database, we do not apply the same filtering for LOS <1 as in inpatient data sources because the MarketScan claims data are already organised in a way that separates hospital outpatient data from inpatient admissions: the hospital outpatient claims in the MarketScan database are included in the "outpatient services" file and are used for outpatient claims estimates. In addition, in the MarketScan database, a claim must also meet certain inclusion criteria to be considered an inpatient admission, such as having a room and board claim. If these criteria are not met, the records are stored in the Outpatient Services Table and no admission record is created.
Hospital inpatient discharge data were adjusted to account for diagnosis and treatment across all admission types and levels of care through multiple data processing steps. Specifically, for each source, we calculated the fraction of all hospital admissions primarily due to appendicitis for a given year, age group, and sex. This fraction was then multiplied by the annual hospital utilisation rate for that specific year, age group, and sex to estimate the annual rates of primary appendicitis admission. To calculate the annual population estimates of appendicitis cases for a given year, age group, and sex, these annual rates of primary admission of appendicitis were adjusted using correction factors that were modelled from the ratio of inpatient discharge with appendicitis as primary diagnosis to all cases ascertained in any health care encounter (inpatient versus outpatient claims data, primary versus secondary diagnosis), using MarketScan claims data as a detailed source of data for all health care encounters for an individual for a year, and then applying that ratio to all inpatient discharge data for all locations. This makes the assumption that in other populations, those cases that either receive care outside of an inpatient encounter (with more than 24 hours of overnight stay) or never receive care at all (and thus are missing from our inpatient data for those populations) occur in approximately the same ratio to the inpatient encounters as in the MarketScan database. Thus, the ratio inflates the inpatient discharge data to account for those missing cases. Estimating the correction factors is done using a non-linear mixed-effects meta-regression tool called meta-regression-Bayesian, regularised, trimmed (MR-BRT). The details about the MR-BRT tool have been published previously. 4,17 Briefly, the MR-BRT program is a set of wrappers customised for global health problems that use the open-source mixed effects package limeTr (https://github.com/zhengp0/limetr).
Specifically, the logs of these ratios were calculated and used as an input to MR-BRT using covariates for age and sex. The equation used for correction factor estimation is shown below (correction to account for inpatient and outpatient care, which provides inpatient admissions and outpatient visits by individuals for all diagnoses):

Table S6. International Classification of Diseases (ICD) codes mapped to Global Burden of Disease cause list for appendicitis incidence data
ICD system | ICD codes
ICD-10 | K35 (K35, K35.0, K35.1, K35.2, K35.3, K35.8, K35.80, K35.89, K35.9); K36 (K36, K36.0); K37 (K37, K37.0, K37.9)
ICD-9 | 540 (540, 540.0, 540.1, 540.9); 541 (541, 541.0, 541.1, 541.2, 541.3, 541.9); 542 (542, 542.0, 542.1, 542.9)

In our model, the reference standard for incidence input data was a combination of adjusted inpatient data and claims data from Poland and Taiwan. This reference serves as a benchmark for adjusting other data sources that are known to have a systematic difference. For appendicitis, specifically in the USA, we used two major sources of data: Truven MarketScan claims data and the Healthcare Cost and Utilization Project (HCUP). HCUP is considered a highly reliable source of information about disease prevalence and health care utilisation. Our database includes HCUP data for 13 USA states. To produce estimates for all 50 states using available data and correcting for errors, we augmented the database with data from Truven, which were available for all states. To ensure comparability and generalisability of the Truven database, we adjusted data from USA MarketScan claims towards the reference (HCUP) to account for systematic differences in its incidence measure due to the commercial insurance status of the enrollees. The process of adjusting for biases in non-reference data using MR-BRT with the logit-transformation method is described below:
1. Identify datapoints with overlapping year, age, sex, and location between commercial claims (non-reference data) and population-representative hospital discharges (reference data).
2. Logit transform overlapping datapoints of alternative and reference types.
3. Convert overlapping datapoints into a difference in logit space using the following equation: $\operatorname{logit}(\text{alt}) - \operatorname{logit}(\text{ref})$.
4. Use the delta method to compute standard errors of overlapping datapoints in logit space, then calculate the standard error of the logit difference using the following equation: $\sqrt{(\text{variance of } \operatorname{logit}(\text{alt})) + (\text{variance of } \operatorname{logit}(\text{ref}))}$.
5. Using MR-BRT, conduct a random effects meta-regression to obtain the pooled logit difference of alternative to reference.
6. Apply the pooled logit difference to all datapoints of alternative case definitions using the following equation: $\text{new estimate} = \operatorname{logit}^{-1}\big(\operatorname{logit}(\text{alt}) - (\text{pooled logit difference})\big)$.
7. Calculate new standard errors using the delta method, accounting for gamma (between-study heterogeneity).
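The logit-difference steps above can be sketched in Python as follows. This is a simplified illustration, not the MR-BRT procedure: the pooling step uses plain inverse-variance weighting as a stand-in for the random effects meta-regression, it omits the between-study heterogeneity term (gamma), and all function names and input values are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit_se(p, se):
    """Delta-method standard error of logit(p), given the standard error of p."""
    return se / (p * (1.0 - p))


def logit_difference(alt, alt_se, ref, ref_se):
    """Steps 3 and 4: difference in logit space and its standard error."""
    diff = logit(alt) - logit(ref)
    se = math.sqrt(logit_se(alt, alt_se) ** 2 + logit_se(ref, ref_se) ** 2)
    return diff, se


def pool_inverse_variance(differences):
    """Simplified stand-in for step 5 (no covariates, no random effects)."""
    weights = [1.0 / se ** 2 for _, se in differences]
    pooled = sum(w * d for w, (d, _) in zip(weights, differences)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def adjust(alt, pooled_diff):
    """Step 6: move an alternative datapoint towards the reference definition."""
    return inv_logit(logit(alt) - pooled_diff)


# Hypothetical matched pair: claims (alternative) vs discharges (reference).
diff, diff_se = logit_difference(alt=0.002, alt_se=0.0001, ref=0.003, ref_se=0.0001)
pooled, pooled_se = pool_inverse_variance([(diff, diff_se)])
adjusted = adjust(0.002, pooled)
```

In this toy pair the alternative rate is lower than the reference, so the logit difference is negative and the adjustment moves the alternative datapoint up to the reference value.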
The table below shows bias correction factors estimated using MR-BRT.
*MR-BRT crosswalk adjustments can be interpreted as the factor the alternative case definition is adjusted by to reflect what it would have been had it been measured using the reference case definition. If the log/logit beta coefficient is negative, then the alternative is adjusted up to the reference. If the log/logit beta coefficient is positive, then the alternative is adjusted down to the reference.
**The adjustment factor column is the exponentiated beta coefficient. For log beta coefficients, this is the relative rate between the two case definitions. For logit beta coefficients, this is the relative odds between the two case definitions.
To address the most substantial sources of heterogeneity, datapoints with an age-standardised incidence rate greater than two median absolute deviations from the median of the age-standardised incidence rate for all data were marked as outliers and excluded from analysis.
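A minimal Python sketch of this outlier rule is given below. It assumes the rule is applied to the absolute deviation from the median and that the median absolute deviation is unscaled (no normal-consistency constant); the text does not specify either detail, and the function name and example rates are hypothetical.

```python
from statistics import median


def mad_outlier_flags(rates, threshold=2.0):
    """Flag rates lying more than `threshold` median absolute deviations
    from the median of all rates (unscaled MAD, an assumption)."""
    centre = median(rates)
    mad = median(abs(rate - centre) for rate in rates)
    return [abs(rate - centre) > threshold * mad for rate in rates]


# Hypothetical age-standardised incidence rates per 100 000.
flags = mad_outlier_flags([100, 110, 120, 130, 400])
```

Here the median is 120 and the median absolute deviation is 10, so only the datapoint at 400 is flagged.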

Excess mortality estimates
Excess mortality rate (EMR) is defined as the rate of death due to appendicitis among those who have appendicitis. In previous rounds of GBD, EMR inputs were produced by matching prevalence datapoints with their corresponding CSMR values within the same age, sex, year, and location (by dividing CSMR by prevalence). For short-duration conditions (remission >1), the corresponding prevalence was derived by running an initial model and then applying the same CSMR/prevalence method. However, this method of producing EMR inputs demonstrated a rather unrealistic pattern of EMR compared to an expected pattern of decreasing EMR with greater access to quality health care. Such unexpected patterns often signal inconsistencies between CSMR estimates and the measures of prevalence and/or incidence. Thus, in an effort to provide greater guidance on the expected pattern of EMR, EMR data produced per above were modelled by age, sex and Healthcare Access and Quality (HAQ) Index using MR-BRT, with a prior on HAQ Index having a negative coefficient. The equation used for this regression is shown below.
EMR predictions for each location, year, sex, and for ages 0, 10, 20, ..., 100 were used as inputs to our non-fatal model. Details on this new modelling method for EMR inputs used for DisMod are described on pages 465-6 of Appendix 1 of the GBD 2019 Diseases and Injuries Capstone. 4
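The CSMR/prevalence calculation used to produce the raw EMR inputs can be written as a one-line function. The Python sketch below is only an arithmetic illustration; the function name and the input values are hypothetical and are not GBD estimates.

```python
def emr_from_csmr(csmr, prevalence):
    """Raw EMR input: cause-specific mortality rate divided by prevalence,
    both for the same age, sex, year, and location."""
    if prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return csmr / prevalence


# Hypothetical matched values, both expressed per person.
example_emr = emr_from_csmr(csmr=0.4e-5, prevalence=8.0e-5)
```

Because the result is a ratio of two separately estimated quantities, small inconsistencies between CSMR and prevalence can produce implausible EMR values, which is the motivation for the regression on the HAQ Index described above.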

Cause-specific mortality rates
Cause-specific mortality rate (CSMR) estimates drawn from the fatal estimation process described above in Section 4 were used in estimating incidence.

Modelling
To estimate incidence of appendicitis, a standard DisMod-MR 2.1 model with location-level covariates was used. Details on DisMod have been published elsewhere. 4,18-20 Briefly, DisMod is a Bayesian mixed-effects meta-regression tool that uses a Bayesian compartmental model framework that solves differential equations for incidence, remission, mortality, and the resulting balance of prevalent cases. It modulates the relationships between a susceptible population, prevalent cases, and those who remit or die from the disease. DisMod solves these differential equations repeatedly down a geographical cascade, starting with global, then down to seven super-regions, 21 regions, 204 countries and territories, and subnational locations for some countries. At the global level, DisMod uses all available data to fit sex-specific estimates for all epidemiological parameters in a steady-state compartmental model with mixed-effects, non-linear regression. The model fit at the global stage is then combined with fixed effects from predictive covariates and location random effects and used as a Bayesian prior for seven models specific to each of the seven super-regions. This process repeats all the way down to the most detailed locations estimated in GBD. The final age-, sex-, location-, and year-specific burden estimates for each epidemiological parameter are produced by aggregating the estimates of disease frequency in the finest locations back up the geographical cascade.
For appendicitis, data inputs for DisMod-MR 2.1 included incidence data from the clinical administrative sources, cause-specific mortality rates estimated as described in Section 4 of this appendix, and excess mortality rates that were separately estimated using the Healthcare Access and Quality Index. A prior value was set on remission so that all cases remit within two weeks. The minimum coefficient of variation at the regional, super-regional, and global level was set at 0.8. We included HAQ Index as a predictive covariate on EMR with a mean and standard deviation produced from the MR-BRT model described above. The fibre (g per day) consumption covariate was included as a predictive covariate on incidence. Betas and exponentiated values (which can be interpreted as odds ratios) of predictive covariates are shown in the table below.
In-sample coverage (by integrand): 1

Disability weight
GBD estimation of disability weights for estimation of years lived with disability (YLDs) has been previously described. 4,7,21,22 Briefly, disability weight represents the magnitude of health loss associated with the disease outcome. It ranges from 0 (perfect health) to 1 (death). Disability weights used for GBD are derived from a series of household and web-based surveys that were conducted between 2009 and 2013 in nine countries. The surveys included a series of lay descriptions of two hypothetical people with two distinct health states, and asked the survey respondents to choose which of the two was healthier. These pair-wise comparisons were analysed using probit regression analyses to infer distances between disability weight values for pairs of states, anchoring to a 0-1 scale. For appendicitis, given the acute nature of the disease, we assumed all incident cases experienced a single health state (severe abdominal pain). The health state, lay description, and disability weight used for appendicitis are shown in Table S10.
The black line represents the total YLL counts in millions with the grey shade representing the 95% uncertainty intervals. The red line represents the total YLL counts in millions with the light red shade representing the 95% uncertainty intervals.

Figure S6. Global temporal trend of years lived with disability (YLDs) of appendicitis between 1990 and 2021
The black line represents the total YLD counts in millions with the grey shade representing the 95% uncertainty intervals. The red line represents the total YLD counts in millions with the light red shade representing the 95% uncertainty intervals.

Figure S2. Data coverage for estimating mortality of appendicitis

Figure S5. Global temporal trend of years of life lost (YLLs) of appendicitis between 1990 and 2021

Table S1. Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) checklist
For each data source used, report reference information or contact name/institution, population represented, data collection method, year(s) of data collection, sex and age range, diagnostic criteria or measurement method, and sample size, as relevant. | Global Health Data Exchange (GHDx) (http://ghdx.healthdata.org/)
6 | Identify and describe any categories of input data that have potentially important biases (e.g., based on characteristics listed in item 5). | pp 7, 9-10
For data inputs that contribute to the analysis but were not synthesized as part of the study:
7 | Describe and give sources for any other data inputs. | None
For all data inputs:
8

Table S9. Summary statistical metrics on DisMod model performance of appendicitis
The table below shows the overall predictive performance of the DisMod model of appendicitis.

Table S10. Disability weight and health state of appendicitis
Health state: Severe abdominal pain
Lay description: This person has severe pain in the belly and feels nauseated. The person is anxious and unable to carry out daily activities.
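The burden arithmetic described in the Summary (YLLs from deaths and standard life expectancy, YLDs from incidence, a 2-week duration, and a disability weight, and DALYs as their sum) can be sketched in Python as follows. The function names are hypothetical, the disability weight is passed in as a parameter rather than assumed, 2 weeks is approximated as 2/52 of a year, and the example inputs are not GBD estimates.

```python
def ylls(deaths_by_age, life_expectancy_by_age):
    """Years of life lost: deaths multiplied by the standard life
    expectancy at the age of death, summed over ages."""
    return sum(
        deaths * life_expectancy_by_age[age]
        for age, deaths in deaths_by_age.items()
    )


def ylds(incident_cases, disability_weight, duration_years=2 / 52):
    """Years lived with disability: incident cases multiplied by the
    average duration (2 weeks, as a fraction of a year) and the
    disability weight."""
    return incident_cases * duration_years * disability_weight


def dalys(yll, yld):
    """Disability-adjusted life-years: YLLs plus YLDs."""
    return yll + yld


# Hypothetical inputs for illustration only.
example_yll = ylls({30: 2, 70: 1}, {30: 50.0, 70: 15.0})
example_yld = ylds(incident_cases=52, disability_weight=0.5)
example_daly = dalys(example_yll, example_yld)
```

Because the assumed duration is short, YLDs per case are small relative to YLLs per death, which is consistent with mortality dominating the DALY trend reported in the Summary.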