Data on clinical and economic burden associated with pulmonary arterial hypertension related hospitalizations in the United States

A comprehensive description of contemporary trends in pulmonary arterial hypertension (PAH)-related hospitalizations, associated inpatient outcomes, and predictors of worse outcomes was reported in our paper recently published in the International Journal of Cardiology [1]. Our observational analysis used ten years of National Inpatient Sample data, from January 1, 2007 through December 31, 2016. This Data in Brief companion paper reports the specific statistical highlights of the entire ten-year PAH cohort, including demographics, hospital characteristics, regional variation, prevalence of comorbidities, and the multivariable regression analyses used to examine the factors associated with increased inpatient mortality and prolonged length of stay. Additionally, we report trends in the cost (the estimated expense incurred by hospitals in providing care, as opposed to the amount billed) of PAH-related hospitalizations over the past ten years.

Subject: Cardiology and Cardiovascular Medicine
Specific subject area: Pulmonary Hypertension
Type of data: Table, Figure
How data were acquired: Data were obtained from the official website of the Healthcare Cost & Utilization Project and were then analysed by the authors using Stata 15.
Data format: Secondary data, Analysed, Descriptive
Parameters for data collection: Sample: The National Inpatient Sample was used to identify patients with a primary discharge diagnosis of pulmonary arterial hypertension using ICD-9 code 416.0 from 01/01/2007 through 09/30/2015 and ICD-10 code I27.0 from 10/01/2015 through 12/31/2016 [2]. Records suggestive of secondary causes of pulmonary hypertension (WHO category II-V) were excluded. Parameters: Demographics, comorbidities, admission characteristics (elective vs. non-elective, primary payer, disposition status), hospital characteristics (US region, government vs. private, teaching vs. non-teaching, hospital bed-size), income status, charges, costs, length of stay, and all-cause mortality.
Description of data collection: The National Inpatient Sample is the largest publicly available, all-payer administrative claims database of inpatient hospitalizations in the US under the Healthcare Cost & Utilization Project. It represents a random, 20% stratified sample of all inpatient hospitalizations from approximately 1000 non-federal hospitals in 46 states (representing > 97% of the total US population) and includes approximately 7 to 8 million hospitalizations per year [2].

Value of the Data
• The data presented here provide real-world statistics on the demographics, racial distribution, regional variation, comorbidities, economic burden and in-hospital outcomes of patients with pulmonary arterial hypertension (PAH). These data may help clinicians appropriately risk-stratify hospitalized PAH patients.
• Our additional analysis of the hospital characteristics, regional variation and insurance status of PAH-related hospitalizations may be useful to insurance companies and policymakers by making them aware of the disparities in economic burden and outcomes in these patients at the institutional and regional levels.
• The analysis reported in this Data in Brief paper may help advocate for continued research and the development of treatment strategies aimed at decreasing hospitalizations, mortality and financial burden in PAH patients.
• The methods described in this study can be useful for future studies on PAH using the National Inpatient Sample.

Data Description
The data comprise an analysis of PAH-related hospitalizations from January 1, 2007 through December 31, 2016, as reported in the related article [1]. Table 1 describes the overall demographics, comorbidities, hospital characteristics, regional variation, and primary payer (insurance status), as well as inpatient outcomes including mortality, length of stay, and charges (and costs) per PAH hospitalization. The mean age of the cohort was 38.7 years and 78.8% were female. Among these hospitalizations, 57.8% of patients were Caucasian, 14.8% were African-American and 16.3% were Hispanic. The majority of patients were admitted to urban teaching hospitals (82.4%), private not-for-profit hospitals (78%) and large bed-size hospitals (76.1%). Patients were variably distributed among US regions, with the highest proportion in the South (31.4%), followed by the West (29.7%), Northeast (22.8%) and Midwest (16.1%). The cohort had a high comorbidity burden, with a mean Elixhauser comorbidity index of 3.2. The prevalence of heart failure was highest (32%), followed by fluid and electrolyte disorders, uncomplicated hypertension, chronic pulmonary disease, obesity, congenital heart disease, valvular heart disease, depression, hypothyroidism, coagulopathy, arrhythmia, uncomplicated diabetes, acute respiratory failure and acute kidney injury (Figs. 1 and 2). The cohort had an overall inpatient mortality of 6.03%, a mean length of stay of 7.6 days, mean inflation-adjusted charges of $89,400 and a mean inflation-adjusted cost of $26,200. Table 2 describes the multivariable analysis to identify predictors of inpatient mortality in patients with PAH-related hospitalizations. The presence of right heart failure, cardiac arrhythmia, neurological disorders other than paralysis, fluid and electrolyte disorders, psychosis, and increased length of stay were among the factors associated with increased odds of inpatient mortality, whereas younger age, Hispanic race and elective admission were associated with decreased odds of inpatient mortality.
Table 3 describes the multivariable analysis to identify predictors of length of stay in patients with PAH. Increased age, female gender, admission to a hospital in the Midwest, and alcohol use disorder were associated with decreased length of stay, whereas the presence of coagulopathy and of fluid and electrolyte disorders was associated with increased length of stay. Table 4 compares the trends in the charges of PAH-related hospitalizations with those of other common conditions such as acute myocardial infarction, acute respiratory failure and others. Although charges increased significantly for all the conditions listed, the increase in charges associated with PAH-related hospitalizations was significantly higher than the increase associated with acute myocardial infarction, acute respiratory failure and others (Fig. 3). Table 5 compares the trends in the costs of PAH-related hospitalizations with those of the same conditions. Although costs increased significantly for all the conditions listed, the increase in costs associated with PAH-related hospitalizations was significantly higher than the increase associated with acute myocardial infarction, acute respiratory failure and others (Fig. 4). Fig. 1 is a graphical representation of the distribution of Elixhauser comorbidities (as also described in Table 1) in patients with PAH. Fig. 2 is a graphical representation of the distribution of other specific comorbidities (as also described in Table 1) in patients with PAH.
Table 1 (inpatient outcome rows):
All-cause inpatient mortality, %: 6.03
Length of stay, days: 7.6 ± 0.5
Total charges, $ (×1,000): 84.1 ± 6.2
Inflation-adjusted charges, $ (×1,000): 89.4 ± 6.6
Total cost, $ (×1,000): 24.6 ± 1.8
Inflation-adjusted cost, $ (×1,000): 26.2 ± 1.9
Total number of PAH discharges is expressed as N.
Continuous variables are expressed as mean ± SE and categorical variables are expressed as frequencies (%). PAH: Pulmonary arterial hypertension; SE: Standard error.
* The race categories shown in the table do not add up to 100% because some patients are categorized as "other" in the NIS.
Note: Pulmonary circulation disorder is an Elixhauser comorbidity that was not included in the analysis because PAH falls under this broad comorbidity category.
Continuous variables are expressed as mean ± SE. p < 0.05 was considered statistically significant. PAH: Pulmonary arterial hypertension; AMI: Acute myocardial infarction; ARF: Acute respiratory failure; SE: Standard error.
Fig. 3 is a graphical representation of the temporal trends in the hospitalization charges of PAH compared to acute myocardial infarction, acute respiratory failure and all other diagnoses from 2007 through 2016 (as described in Table 4). Bars indicate standard errors and solid lines indicate mean charges for the respective diagnosis. p < 0.05 was considered statistically significant. PAH: Pulmonary arterial hypertension; AMI: Acute myocardial infarction; ARF: Acute respiratory failure.
Fig. 4 is a graphical representation of the temporal trends in the hospitalization costs of PAH compared to acute myocardial infarction, acute respiratory failure and all other diagnoses from 2007 through 2016 (as described in Table 5). Bars indicate standard errors and solid lines indicate mean costs for the respective diagnosis. p < 0.05 was considered statistically significant. PAH: Pulmonary arterial hypertension; AMI: Acute myocardial infarction; ARF: Acute respiratory failure.

Data source
The Healthcare Cost and Utilization Project (HCUP) is a family of databases developed through a Federal-State-Industry partnership and sponsored by the Agency for Healthcare Research and Quality (AHRQ). The National Inpatient Sample (NIS) is an HCUP database and is the largest publicly available, all-payer administrative claims database of inpatient hospitalizations in the US. It represents a random, 20% stratified sample of all inpatient hospitalizations from approximately 1000 non-federal hospitals in 46 states (representing > 97% of the total US population) and includes approximately 7 to 8 million hospitalizations per year [2]. A discharge weight is provided for each patient discharge record to represent the relative proportion of the total US inpatient hospital population for each record, allowing for calculation of national estimates [3]. Therefore, the PAH cohort in this study is broadly representative of the PAH population within the US. The NIS database includes de-identified information on patient demographics and clinical data, including primary and secondary discharge diagnoses, comorbidities, and outcomes for each sampled hospitalization. All diagnoses and comorbidities are available in the NIS database as International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes until September 30, 2015, and as International Classification of Diseases, Tenth Revision, Clinical Modification/Procedure Coding System (ICD-10-CM/PCS) codes from October 1, 2015 onwards. Since this is an analysis of publicly available de-identified data, our Institutional Review Board guidelines stipulate that board approval of the study and the need for informed consent are waived.
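As an illustration of how discharge weights turn sampled records into national estimates, the following is a minimal Python sketch. It is not the authors' Stata code. The field names `discharge_weight` and `los_days` and the sample values are hypothetical, and the sketch computes point estimates only; standard errors require design-based survey methods that account for clustering and stratification.

```python
def national_estimates(records, weight_key="discharge_weight", value_key="los_days"):
    """Return (estimated national discharges, weighted mean of value_key).

    Each sampled record counts for `weight` discharges nationally, so the
    national count is the sum of weights and the mean is weight-averaged.
    Field names are hypothetical, not actual NIS variable names.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for rec in records:
        w = float(rec[weight_key])
        total_weight += w
        weighted_sum += w * float(rec[value_key])
    if total_weight == 0:
        raise ValueError("no records, or all weights are zero")
    return total_weight, weighted_sum / total_weight


# Made-up records for illustration only (not NIS data).
sample = [
    {"discharge_weight": 5.0, "los_days": 4.0},
    {"discharge_weight": 5.0, "los_days": 10.0},
    {"discharge_weight": 10.0, "los_days": 7.0},
]
n_national, mean_los = national_estimates(sample)
print(n_national, mean_los)
```

With these made-up values the three sampled records stand for 20 national discharges, and the record with the larger weight contributes proportionally more to the mean.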

Data availability
HCUP NIS data are available from the AHRQ, Rockville, MD (https://www.hcup-us.ahrq.gov/nisoverview.jsp). HCUP data are available to all researchers following a standard application process and signing of a data use agreement. The authors confirm that they had no special access to the data used in this study (2007-2016). The authors paid a fee to access the NIS data used in this study, in accordance with the fee schedule posted by the HCUP Central Distributor, the entity that accepts, processes, and fulfills applications for the purchase of HCUP databases and manages data use agreements (DUAs) for all data users (https://www.hcup-us.ahrq.gov/tech_assist/centdist.jsp). Researchers interested in purchasing and using HCUP databases will be required to complete the Web-based HCUP DUA (https://www.hcup-us.ahrq.gov/tech_assist/dua.jsp) and read and sign the HCUP DUA. Further instructions for submitting an application to purchase HCUP databases can be found at https://www.distributor.hcup-us.ahrq.gov.

Validation and quality control
Annual data quality assessments of the NIS are performed to maintain the internal validity of the database. Estimates from the NIS are compared with the American Hospital Association Annual Survey Database, the National Hospital Discharge Survey from the National Center for Health Statistics, and the MedPAR inpatient database from the Centers for Medicare and Medicaid Services [4].

Study population
We identified all patients in the NIS database from January 1, 2007, through December 31, 2016, with a primary discharge diagnosis of PAH using the ICD-9-CM primary diagnosis code 416.0 (through September 30, 2015) and the ICD-10-CM primary diagnosis code I27.0 (from October 1, 2015). Discharge records suggestive of secondary causes of pulmonary hypertension (WHO Category II-V) were excluded. These included left heart (systolic or diastolic) failure, chronic obstructive pulmonary disease, interstitial lung disease, mitral or aortic valve disease, atrial flutter/fibrillation, coronary artery disease, complicated hypertension and diabetes, end-stage renal disease, metastatic cancer and paralysis. This yielded a final cohort of 6162 records with PAH as the primary diagnosis, as described in the related original article [1].
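The selection logic can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the record layout, the field names `primary_dx` and `secondary_ph_cause`, and the example records are hypothetical. In particular, the exclusion criteria are represented by a single precomputed flag rather than by the actual diagnosis codes, which are not listed here.

```python
# 416.0 (ICD-9-CM) and I27.0 (ICD-10-CM), stored without the decimal point.
# Storing codes without the dot is an assumption about the input data.
PAH_PRIMARY_CODES = {"4160", "I270"}


def normalize_code(code):
    """Strip whitespace and the decimal point, and upper-case the code."""
    return code.replace(".", "").strip().upper()


def select_cohort(records):
    """Keep records with PAH as primary diagnosis and no secondary PH cause.

    `secondary_ph_cause` is a hypothetical flag standing in for the
    WHO Category II-V exclusions described in the text.
    """
    cohort = []
    for rec in records:
        if normalize_code(rec["primary_dx"]) not in PAH_PRIMARY_CODES:
            continue  # primary diagnosis is not PAH
        if rec.get("secondary_ph_cause", False):
            continue  # suggestive of a secondary cause; excluded
        cohort.append(rec)
    return cohort


# Made-up records for illustration only.
records = [
    {"id": 1, "primary_dx": "416.0", "secondary_ph_cause": False},
    {"id": 2, "primary_dx": "I27.0", "secondary_ph_cause": True},
    {"id": 3, "primary_dx": "I21.4", "secondary_ph_cause": False},
    {"id": 4, "primary_dx": "i270", "secondary_ph_cause": False},
]
cohort_ids = [rec["id"] for rec in select_cohort(records)]
print(cohort_ids)
```

Normalizing the code before comparison lets the same filter accept codes written with or without the decimal point.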

Covariates
For each discharge record with a primary diagnosis of PAH, we obtained the following variables: patient demographics, comorbidities, admission characteristics (elective vs. non-elective, primary payer, disposition status), and hospital characteristics (US region, government vs. private, teaching vs. non-teaching, hospital bed-size). Patients' income status was defined per HCUP's quartile classification of the estimated median household income of residents in the patient's ZIP Code, indicating the poorest to wealthiest populations [5]. Primary payer status was categorized as Medicare, Medicaid, private insurance and uninsured. Disposition at discharge was categorized as routine, short-term hospital, nursing home or rehabilitation, and home health care [5]. Hospital size was defined as small, medium, or large as per HCUP criteria [6].
Comorbidity burden was assessed using the Elixhauser comorbidities and index, based on coding algorithms developed by Elixhauser et al. [7] and Quan et al. [8], which have been previously validated as predictors of outcomes in administrative databases [9,10]. We also compared the distribution of a few other specific inpatient comorbidities that are clinically pertinent to PAH patients (e.g. congenital heart disease, syncope, cardiogenic shock, cirrhosis, portal hypertension, hepatitis C, respiratory failure, pneumonia, kidney disease, acute cerebrovascular disease). These comorbidities were identified using valid ICD-9-CM and ICD-10-CM/PCS codes [11,12].
The raw hospital charges provided in the NIS database represent the amount of money billed by the hospitals for services rendered; they do not indicate how much the hospital services actually cost or the specific amounts that hospitals received in payment. The HCUP Cost-to-Charge Ratios enable conversion of charges to the actual expenses incurred in the production of hospital services, such as wages, supplies, and utility costs. The cost of each inpatient stay was estimated by multiplying the total hospital charge by the cost-to-charge ratio. We report the charges and cost for each year after adjustment for inflation with reference to the 2016 U.S. dollar value, using the latest Consumer Price Index data (http://www.bls.gov/data/inflation_calculator.htm).
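The two-step conversion described above (charge to cost, then adjustment to reference-year dollars) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the charge, cost-to-charge ratio and Consumer Price Index values are placeholders chosen for easy arithmetic, not HCUP or Bureau of Labor Statistics figures.

```python
def estimate_cost(total_charge, cost_to_charge_ratio):
    """Estimated cost of a stay = billed charge x hospital cost-to-charge ratio."""
    return total_charge * cost_to_charge_ratio


def adjust_to_reference_year(amount, cpi_year, cpi_reference):
    """Express `amount` from a given year in reference-year dollars.

    Scales by the ratio of the reference-year CPI to the CPI of the
    year in which the amount was recorded.
    """
    if cpi_year <= 0:
        raise ValueError("CPI must be positive")
    return amount * cpi_reference / cpi_year


# Placeholder inputs for illustration only (not real HCUP or BLS values).
charge = 80000.0   # amount billed for one stay
ccr = 0.25         # hypothetical hospital cost-to-charge ratio
cost = estimate_cost(charge, ccr)
adjusted_cost = adjust_to_reference_year(cost, cpi_year=200.0, cpi_reference=240.0)
print(cost, adjusted_cost)
```

With these placeholder inputs, a charge of 80,000 at a ratio of 0.25 gives an estimated cost of 20,000, which becomes 24,000 in reference-year dollars.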

Outcomes
Our primary outcomes included total hospitalizations with the primary discharge diagnosis of PAH, associated length of stay, all-cause inpatient mortality and charges (and costs) per hospitalization. As secondary outcomes, we examined the predictors of inpatient mortality and prolonged length of stay along with trends in the charges and costs of PAH hospitalization compared to other common causes of hospitalization like acute myocardial infarction, acute respiratory failure and all other diagnoses.

Statistical analysis
National estimates for the total number of discharges with a primary diagnosis of PAH per year (from 2007 through 2016) were generated from the NIS database using trend weights and published HCUP methods [13]. Continuous variables are reported as mean ± standard error, and categorical variables are reported as frequencies (percentages). Trends were analyzed using linear regression for continuous variables. Multivariable regression models were generated to identify factors associated with all-cause inpatient mortality and length of stay. Clinically relevant variables and those with p < 0.3 in the univariate analysis were included in the corresponding multivariable-adjusted models. All analyses in the study were performed according to recommended AHRQ/HCUP methods to account for the complex survey design, clustering, stratification and weighting, and in agreement with best practices for conducting research using the NIS database. All statistical analyses were completed using Stata 15.0 (StataCorp, College Station, TX, USA). A p value < 0.05 was considered statistically significant.
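To make the trend analysis concrete, the following Python sketch fits an ordinary least-squares line to yearly means. It is a simplified illustration and an assumption on our part about the form of the model; it ignores the survey design, weighting and significance testing that the analysis itself accounts for, and the function name and input values are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares fit of values on years; returns (slope, intercept).

    The slope is the estimated change in the outcome per calendar year.
    This unweighted fit does not account for survey design.
    """
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up yearly means for illustration only (not study results).
years = [2007, 2008, 2009, 2010]
yearly_means = [10.0, 12.0, 14.0, 16.0]
slope, intercept = linear_trend(years, yearly_means)
print(slope, intercept)
```

For these made-up values the fitted slope is 2 units per year.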

Ethics Statement
This is an analysis of publicly available de-identified patient data. Therefore, our Institutional Review Board guidelines stipulate that board approval of the study and the need for informed consent are waived.

Declaration of Competing Interest
All authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.