Diabetes diagnosis and management among insured adults across metropolitan areas in the U.S.

This study provides diabetes-related metrics for the 50 largest metropolitan areas in the U.S. in 2012—including prevalence of diagnosed and undiagnosed diabetes, insurance status of the population with diabetes, diabetes medication use, and prevalence of poorly controlled diabetes. Diabetes prevalence estimates were calculated using cross-sectional data combining the Behavioral Risk Factor Surveillance System, American Community Survey, National Nursing Home Survey, Census population files, and National Health and Nutrition Examination Survey. Analysis of medical claims files (2012 de-identified Normative Health Information database, 2011 Medicare Standard Analytical Files, and 2008 Medicaid Analytic eXtract) produced information on treatment and poorly controlled diabetes by geographic location, insurance type, sex, and age group. Among insured adults with diagnosed type 2 diabetes in 2012, the proportion receiving diabetes medications ranged from 83% in Oklahoma City, Oklahoma, to 65% in West Palm Beach, Florida. The proportion of treated patients with medical claims indicating poorly controlled diabetes was lowest in Minneapolis, Minnesota (36%) and highest in Texas metropolitan areas of Austin (51%), San Antonio (51%), and Houston (50%). Estimates of diabetes detection and management across metropolitan areas often differ from state and national estimates. Local metrics of diabetes management can be helpful for tracking improvements in communities over time.


Introduction
Diabetes is a major epidemic in the United States, yet many people with diabetes are undiagnosed, uninsured, or have suboptimal health or adherence outcomes (Centers for Disease Control and Prevention, 2014a, 2014b; Dall et al., 2016). Detection, insurance coverage, and quality of care for people with diabetes were anticipated to improve with implementation of the Affordable Care Act (ACA), new U.S. Preventive Services Task Force guidelines around screening and treatment, and evolving standards of care (US Preventive Services Task Force, 2015; American Diabetes Association, 2017). The diabetes-related implications of changes in policy and treatment guidelines may take years to become apparent, and evaluating their impact over time requires baseline diabetes metrics for comparison.
Select metrics of diabetes detection and management have been calculated for 2012 at the state and national levels (Dall et al., 2016; Centers for Disease Control and Prevention, 2014a, 2014b). In 2012, 8.1 million adults were unaware they had diabetes (CDC, 2014a) and another 4.9 million with diagnosed diabetes lacked medical insurance (Dall et al., 2016). About 92% of insured adults with diagnosed diabetes had type 2 diabetes, and many of these adults had suboptimal outcomes related to medication adherence, glycemic control, and presence of complications. For example, among insured type 2 patients receiving antidiabetic medications, the proportion with medical claims indicating poor diabetes control ranged from 53% in Texas to 29% in Minnesota and Iowa (Dall et al., 2016).
Limited information is available to assess type 2 diabetes detection and management across metropolitan areas, and the existing information is based primarily on self-reported information collected through telephone surveys (CDC, 2014b). This study extends previously published national and state analyses (Dall et al., 2016) to construct diabetes detection and management metrics for the 50 largest metropolitan areas where over half the nation's population with diabetes resided in 2012. The claims analysis focused on the diagnosed type 2 population with medical insurance. Such information provides local baseline metrics to track progress over time in diagnosing and treating people with type 2 diabetes.

Methods
Our approach to calculating type 2 prevalence, detection, and treatment outcomes by metropolitan area is similar to the approach we used to model national and state metrics (American Diabetes Association, 2013; Dall et al., 2014a, 2014b, 2016). We first estimated the size of the adult population in each metropolitan area by age group (20-34, 35-44, 45-54, 55-59, 60-64, 65-70, and over 70 years), sex, and insurance type (private, Medicare, Medicaid, uninsured). We used survey data to estimate the prevalence of diagnosed and undiagnosed diabetes for each segment of the population (by demographic and insurance type), and used medical and pharmaceutical claims to estimate the proportion of diabetes patients with type 2 and the prevalence of treatment outcomes within each segment. We aggregated results across demographic groups to provide metropolitan-level statistics.
The metropolitan areas modeled are officially designated Metropolitan Statistical Areas (MSAs), with three exceptions: (1) New York-Newark-Jersey City, NY-NJ-PA MSA was split into the New York population and the Northern New Jersey metropolitan designation, (2) Orange County, California was carved out of the Los Angeles-Long Beach-Anaheim, California MSA and reported separately, and (3) West Palm Beach was reported separately from Miami-Fort Lauderdale, Florida (whereas the official designation of this metropolitan statistical area is Miami-Fort Lauderdale-West Palm Beach).
This study used secondary data sources and received an exemption from the New England Institutional Review Board.

Data sources and selection and exclusion criteria
To construct the population file for each MSA, we started with data on population size by 5-year age group, sex, and race/ethnicity (non-Hispanic white, non-Hispanic black, non-Hispanic other, Hispanic) from the 2012 U.S. Census Bureau Annual County Estimates Population file linked to each MSA (U.S. Census Bureau, 2015a, 2015b). To construct a person-level file for each MSA, we used random selection with replacement from the 2012 American Community Survey (ACS, n = 2,375,715) to draw a sample of people living in a metropolitan area, where the number of records drawn reflected the size of each demographic stratum defined by 5-year age group, sex, and race/ethnicity among the ACS participants in each state. From the ACS data, we obtained information on household income, medical insurance status, and whether the person resides in the community or in a nursing home. Each person in the constructed population who resides in the community was then matched with a similar person from the 2011 and 2012 Behavioral Risk Factor Surveillance System (BRFSS, n = 982,154) using random sampling with replacement to match by age group, sex, family income level (8 levels), insurance status, metropolitan residency status, and state. For individuals in the ACS file who reside in a nursing home, we used random sampling with replacement to match each ACS person with a nursing home resident in the 2004 National Nursing Home Survey (NNHS, n = 13,507) of similar age, sex, and race/ethnicity.
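The matching step described above can be illustrated with a minimal Python sketch. It is not the study's code: the record fields (`age_group`, `sex`, `has_diabetes`), the function name, and the toy records are all hypothetical, and only two matching variables are used for brevity. Each person in the constructed population is paired with a donor record drawn at random, with replacement, from the same stratum.

```python
import random
from collections import defaultdict

def match_with_replacement(population, donors, keys, rng):
    """Attach a randomly drawn donor record from the same stratum to each person."""
    # Group donor records (e.g., survey respondents) by their stratum key.
    pools = defaultdict(list)
    for d in donors:
        pools[tuple(d[k] for k in keys)].append(d)
    matched = []
    for p in population:
        pool = pools.get(tuple(p[k] for k in keys))
        # Sampling with replacement: the same donor may be reused many times.
        donor = rng.choice(pool) if pool else None  # None: no donor in this stratum
        matched.append({**p, "donor": donor})
    return matched

# Toy, made-up records (hypothetical field names).
population = [
    {"id": 1, "age_group": "45-54", "sex": "F"},
    {"id": 2, "age_group": "45-54", "sex": "F"},
    {"id": 3, "age_group": "65-70", "sex": "M"},
]
donors = [
    {"age_group": "45-54", "sex": "F", "has_diabetes": True},
    {"age_group": "45-54", "sex": "F", "has_diabetes": False},
    {"age_group": "65-70", "sex": "M", "has_diabetes": True},
]
matched = match_with_replacement(
    population, donors, ("age_group", "sex"), random.Random(0)
)
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which matters when the constructed file is regenerated for sensitivity checks.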
The resulting population file for each MSA contained a representative sample of the population complete with diagnosed diabetes prevalence; demographics; previous diagnosis or history of asthma, arthritis, heart attack, stroke, cancer, hypertension, high cholesterol, and cardiovascular disease; current smoker; body weight defined by body mass index (National Institutes of Health, 2000) as normal (BMI < 25), overweight (25 ≤ BMI < 30), or obese (30 ≤ BMI); and insurance type (Medicare, Medicaid, commercially insured, uninsured). Diagnosed diabetes prevalence, presence or history of the other chronic diseases modeled, and body weight information for the community-based population were self-reported in BRFSS. For the nursing home population, disease status and body weight information came from clinical diagnosis in NNHS. This information allowed us to estimate the prevalence of diagnosed diabetes by insurance type within each MSA.
Estimated prevalence of undiagnosed diabetes for each MSA was constructed by applying a regression-based predictive model (described later) to each person in the representative population sample. This predictive model was estimated using the 2005-2012 files of the National Health and Nutrition Examination Survey (NHANES) for adults without a previous diagnosis of diabetes and who did not use insulin. NHANES is a nationally representative sample of the non-institutionalized population. Approximately one third of NHANES adults were randomly chosen to undergo laboratory tests, including hemoglobin A1c (A1c) and fasting plasma glucose (FPG) testing, which we used to determine diabetes, prediabetes, or normal glucose status. The sample analyzed excluded pregnant women (n = 463) and consisted of adults with previously undiagnosed diabetes (n = 1209), prediabetes (n = 7190), or normal glucose levels (n = 10,719).
Diabetes treatment outcomes were calculated using medical and pharmacy claims data for each population stratum (e.g., insurance type, age group, and sex), and were multiplied by the population size in the corresponding stratum from the constructed population file for each MSA. Medical claims for commercially insured adults in each MSA came from the 2011-2012 OptumInsight de-identified Normative Health Information database (dNHI, n = 29,948,496), for the Medicare population from the 2011 Medicare 5% sample (n = 2,805,812), and for the Medicaid population from the Centers for Medicare and Medicaid Services (CMS) 2008 Mini-Max file (n = 3,095,634). The dNHI database used medical and medication claims and membership data from January 2011 through December 2012, and the file contains longitudinally linked and statistically de-identified individual-level data from UnitedHealth Group and non-UnitedHealth Group sources. The Medicare 5% sample contains medical and prescription claims. Mini-Max is a 5% sample of the Medicaid Analytic eXtract data, a set of person-level data files on Medicaid eligibility, service utilization, and payments for > 60 million Medicaid enrollees extracted from the Medicaid Statistical Information System.
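The stratum-level calculation can be sketched as a population-weighted average. This is a minimal illustration under stated assumptions: the function name, field names, patient counts, and rates below are invented for the example and are not study values.

```python
def msa_outcome_share(strata):
    """Population-weighted share of patients with an outcome across strata.

    Each stratum carries the number of patients from the constructed
    population file ("patients") and the claims-based outcome rate for
    that stratum ("rate").
    """
    total = sum(s["patients"] for s in strata)
    # Multiply each stratum's rate by its population size, then aggregate.
    with_outcome = sum(s["patients"] * s["rate"] for s in strata)
    return with_outcome / total if total else float("nan")

# Hypothetical strata for a single MSA (made-up numbers).
strata = [
    {"insurance": "private", "age_group": "45-54", "sex": "F",
     "patients": 1000, "rate": 0.80},
    {"insurance": "Medicare", "age_group": "65-70", "sex": "M",
     "patients": 3000, "rate": 0.70},
]
share = msa_outcome_share(strata)  # (800 + 2100) / 4000
```

Weighting by the constructed population, rather than by the claims sample, is what lets claims from one insured sample stand in for the full insured population of each MSA.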
Patients analyzed in each of these databases were continuously enrolled in a fee-for-service coverage type plan with no more than one gap in enrollment of up to 45 days during the measurement year for the commercially insured population, and were enrolled for all 12 months for the Medicare and Medicaid populations.
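One possible reading of the commercial enrollment criterion (at most one gap, lasting no more than 45 days, within the measurement year) is sketched below. The function name and the representation of enrollment as (start, end) date pairs are assumptions made for illustration; the study's actual enrollment logic is not described in that detail.

```python
from datetime import date, timedelta

def continuously_enrolled(spans, year_start, year_end, max_gap_days=45):
    """True if coverage over the year has at most one gap of <= max_gap_days."""
    gaps = []
    cursor = year_start  # first day not yet known to be covered
    for start, end in sorted(spans):
        if start > cursor:
            # Uncovered days between the cursor and the next span (clamped to the year).
            gaps.append((min(start, year_end + timedelta(days=1)) - cursor).days)
        cursor = max(cursor, end + timedelta(days=1))
        if cursor > year_end:
            break
    if cursor <= year_end:
        # Coverage ended before the measurement year did.
        gaps.append((year_end - cursor).days + 1)
    return len(gaps) <= 1 and all(g <= max_gap_days for g in gaps)

year_start, year_end = date(2012, 1, 1), date(2012, 12, 31)
one_short_gap = continuously_enrolled(
    [(date(2012, 1, 1), date(2012, 6, 30)), (date(2012, 8, 1), date(2012, 12, 31))],
    year_start, year_end,
)  # a single 31-day gap in July
```

For the Medicare and Medicaid populations, the stricter 12-month requirement corresponds to calling the same function with `max_gap_days=0`.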

Definitions of key outcomes
Key outcomes modeled include: undiagnosed diabetes, type 2 diabetes, receiving medication for diabetes, and poorly controlled diabetes.

Undiagnosed diabetes patients
Undiagnosed diabetes patients are those in NHANES with blood glucose levels indicating diabetes who had not previously been diagnosed by a health provider and were not taking insulin or oral anti-diabetic medication. Similar to the approach used by CDC (2017), we identified NHANES participants with undiagnosed diabetes as those with FPG ≥ 126 mg/dL or A1c ≥ 6.5%. A limitation of NHANES is that no follow-up confirmatory test is available, which introduces false positives and negatives. The predictive model applied to the metropolitan population files to estimate undiagnosed diabetes prevalence has been described elsewhere (Dall et al., 2014a, 2014b, 2016). This model used logistic regression with diagnosis status as the dependent variable. Explanatory variables consisted of demographics; presence of diseases included in the constructed population files (e.g., hypertension, cardiovascular disease); overweight, obesity, and current smoking status; and household income and insurance status.
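The laboratory-based case definition can be expressed as a small Python function. This is a sketch only: the function and parameter names are hypothetical, the thresholds are assumed to be in mg/dL (FPG) and percent (A1c), and the treatment of missing lab values as non-qualifying is an assumption rather than a documented study rule.

```python
def has_undiagnosed_diabetes(fpg_mg_dl, a1c_pct,
                             previously_diagnosed, on_diabetes_medication):
    """Lab values meet the diabetes threshold with no prior diagnosis or treatment.

    Missing lab values are passed as None and do not trigger the threshold
    (an assumption made for this sketch).
    """
    if previously_diagnosed or on_diabetes_medication:
        return False
    fpg_high = fpg_mg_dl is not None and fpg_mg_dl >= 126
    a1c_high = a1c_pct is not None and a1c_pct >= 6.5
    return fpg_high or a1c_high

# Hypothetical participant: elevated A1c, normal fasting glucose, no prior diagnosis.
flag = has_undiagnosed_diabetes(110, 6.7, False, False)
```

Because either test alone is sufficient and no confirmatory test exists, a participant with one borderline result is classified as having diabetes, which is the source of the false positives noted above.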

Type 2 diabetes
Within the three medical claims databases, we identified patients with diabetes if the patient had at least one emergency department visit or hospitalization or two ambulatory visits (30 days apart) with a diabetes diagnosis (International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM] diagnosis code of 250.xx) submitted during the year, or if the patient used insulin or other diabetes-related medications. Sample inclusion and exclusion criteria, as well as the algorithm for distinguishing whether a patient had type 1 or type 2 diabetes based on medications taken and diagnosis codes, are described in detail elsewhere (Dall et al., 2016). The constructed analytic sample only included patients with diagnosed type 2 diabetes and excluded people younger than 18 and adults who had evidence of gestational diabetes or a pregnancy.
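The case-identification rule can be sketched in Python as follows. The claim layout (`setting`, `date`, `dx_codes`) and function names are hypothetical, and "30 days apart" is read here as "at least 30 days apart", which is an interpretation rather than something the text states. The type 1 versus type 2 algorithm is described elsewhere and is not reproduced.

```python
from datetime import date

def is_diabetes_code(icd9):
    """ICD-9-CM 250.xx, written with or without the decimal point."""
    return icd9.replace(".", "").startswith("250")

def meets_diabetes_definition(claims, has_diabetes_rx, min_days_apart=30):
    """Apply the claims-based diabetes definition to one patient's annual claims."""
    if has_diabetes_rx:  # insulin or other diabetes-related medication
        return True
    dx = [c for c in claims if any(is_diabetes_code(code) for code in c["dx_codes"])]
    # One emergency department visit or hospitalization is sufficient.
    if any(c["setting"] in ("emergency", "inpatient") for c in dx):
        return True
    # Otherwise require two ambulatory visits at least min_days_apart apart;
    # such a pair exists exactly when the earliest and latest visits qualify.
    amb_dates = sorted(c["date"] for c in dx if c["setting"] == "ambulatory")
    return bool(amb_dates) and (amb_dates[-1] - amb_dates[0]).days >= min_days_apart

# Hypothetical patient with two ambulatory diabetes visits about two months apart.
claims = [
    {"setting": "ambulatory", "date": date(2012, 2, 1), "dx_codes": ["250.00"]},
    {"setting": "ambulatory", "date": date(2012, 4, 3), "dx_codes": ["401.9", "250.02"]},
]
identified = meets_diabetes_definition(claims, has_diabetes_rx=False)
```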

Medically treated diabetes
We analyzed pharmacy claims to identify patients receiving treatment for their diabetes. Diabetes treatment was defined as having any pharmacy claims for insulin, non-insulin injectables, or oral antidiabetic agents. National Drug Code therapeutic classes were used to identify the medication (Dall et al., 2016).

Poor control status
Whether a person's diabetes is under control is generally determined by blood glucose levels (ADA, 2017; Christophi et al., 2012), but lab results with A1c values were available only for a subset of the commercially insured patients in the dNHI database and were unavailable for the Medicare and Medicaid populations. To consistently identify patients with poorly controlled diabetes we used ICD-9 diagnosis codes of 250.x2 and 250.x3 in any claim. For discussion, we refer to poorly controlled status for anyone with an ICD-9 code indicating uncontrolled diabetes at some point during the year, and "controlled" status was determined by the lack of a code indicating "uncontrolled" status.
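The 250.x2 and 250.x3 pattern, in which the fifth digit of the code is 2 or 3, can be matched with a short regular expression. The sketch below is illustrative: the names are hypothetical, and it assumes codes may arrive with or without the decimal point.

```python
import re

# 250.x2 / 250.x3: any fourth digit, fifth digit 2 or 3 (decimal point optional).
UNCONTROLLED = re.compile(r"^250\.?\d[23]$")

def poorly_controlled(dx_codes):
    """True if any diagnosis code on the patient's claims indicates uncontrolled diabetes."""
    return any(UNCONTROLLED.match(code.strip()) is not None for code in dx_codes)

# Hypothetical patient-year of diagnosis codes.
flag = poorly_controlled(["401.9", "250.00", "250.02"])
```

A patient with no matching code during the year falls into the "controlled" group by default, which mirrors the definition above.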

Results
Diabetes prevalence
Diabetes metrics for each MSA are presented in Tables 1 and 2 (levels) and Table 3 (percentages), with confidence intervals in Appendix Tables 1a-3a, respectively. About half (53%) of U.S. adults resided in these MSAs, including 51% of diabetes patients and 55% of adults with undiagnosed diabetes (Table 1). New York City had the largest number of adults with diabetes (1.2 million), while Los Angeles had the largest number with undiagnosed diabetes (366,200) and the largest number with diagnosed diabetes who lacked medical insurance (131,900) (Table 1).
Among diabetes patients analyzed, 29% were undiagnosed; among diagnosed patients 17% were uninsured; among insured patients 92% had type 2 diabetes (Table 3). Los Angeles had the largest share of adults with undiagnosed diabetes (37%), and Cleveland, Cincinnati, and Nashville had the lowest (24%). A substantial share of diagnosed diabetes patients lacked medical insurance; the share was highest in three Texas MSAs (San Antonio [25%], Houston [25%], and Austin [24%]) and lowest in Boston (9%) and New York (10%). San Antonio had the highest share of insured, diagnosed patients with type 2 diabetes (96%), in contrast to 88% in the Denver MSA.

Medication use and diabetes control
Among the insured population with type 2 diabetes, 76% had medication claims for antidiabetic agents, ranging from 83% in Oklahoma City to 65% in West Palm Beach (Table 3). Among those patients receiving medication, nationally 44% had medical claims indicating poorly controlled diabetes, with local estimates lowest in Minneapolis (36%). MSAs in Texas ranked among the highest for patients with medical claims indicating uncontrolled diabetes, including Austin (51%), San Antonio (51%), Houston (50%), and Dallas-Fort Worth (47%).

Discussion
This study illustrates geographic variation in measures of diabetes prevalence, detection and management in 2012 and serves the following purposes: (1) providing a pre-ACA baseline to track outcomes over time, (2) providing communities with a comparison to state and national benchmarks, and (3) identifying communities performing well that could be used to identify best practices.
There is little published information at the metropolitan level for comparison to our findings. At the national level our results are similar to other estimates for the percentage of diagnosed diabetes that are type 2 and prevalence of undiagnosed diabetes. Our estimates of medication use based on claims data are lower than estimates based on self-reported data (CDC, 2013), while our estimates of patients receiving annual A1c testing and having their diabetes under control are similar to estimates based on electronic medical records (Courtemanche et al., 2013).
Geographic variation in diabetes care has been previously noted, but even with controlling for demographics, the source of the variation remains unclear (Egede et al., 2011a, 2011b; Ford et al., 2005; Lynch et al., 2015). Areas for future research include: (1) understanding how expanded medical insurance coverage under the Affordable Care Act has helped close one gap to receiving treatment (i.e., high rates of uninsured people with diabetes); and (2) understanding how new and expanded guidelines for screening asymptomatic adults for prediabetes and diabetes might help close another gap (i.e., high prevalence of undiagnosed diabetes).

Study strengths and limitations
A strength of this study is its use of large medical claims files covering the commercially insured, Medicare, and Medicaid populations to provide insight on treatment patterns, prevalence of diabetes-related complications, and health care expenditures. This analysis of medical claims provides a useful comparison to diabetes treatment and management statistics based on self-reported survey data collected through telephone survey (BRFSS) and suggests that self-reported data might overstate the proportion of patients receiving treatment for their diabetes.
Data limitations include the following:
• Claims-based case identification algorithms for type 2 diabetes may not be completely accurate. Algorithms that use a combination of both physician claims data and hospital discharge data have a sensitivity ranging from 57% to 95.6%, specificity ranging from 88% to 98.5%, positive predictive values ranging from 54% to 80%, and negative predictive values ranging from 98% to 99.6% (Khokhar et al., 2016). However, more involved algorithms similar to ours, which combine physician claims, facility claims, and prescription drug claims and apply multiple rules to differentiate type 1 from type 2 diabetes, have shown higher sensitivities (97% [95% CI 87-100] for identifying type 1 diabetes and 93% [85-98] for type 2 diabetes) (Klompas et al., 2013).
• For some MSAs, the medical claims sample was small for some demographic groups (in particular the age 20-34 population where diabetes prevalence is low). When the sample size fell below 30 adults for a particular demographic group, we then used information for that same demographic group at the next higher level of aggregation. For example, if the sample size for percent of diagnosed patients who were type 2 (versus type 1) was below 30 for a particular MSA, then we used the corresponding percentage at the state level for that demographic group. If the state sample size was below 30, then we used the corresponding percentage at the national level for that demographic group. Metropolitan-level estimates, therefore, were a combination of metropolitan, state and national metrics. This approach provides stability to estimates, but also biases the estimates towards the state and national averages.

• The medical claims analysis only includes insured patients in fee-for-service plans, as medical claims are often incomplete for patients in capitated managed care plans. It is unclear how this might bias study results. Medicare patients with poorer health might choose a fee-for-service plan because it gives them greater access to physicians not in a managed care network, but this is unlikely to affect diabetes detection, treatment, and control.
• Uncontrolled status was based on ICD-9 diagnosis codes and not on independent lab results. For validation, we analyzed the subset of commercially insured patients with type 2 diabetes for which we had both ICD-9 and A1c information. We found a significant and positive correlation between ICD-9 and A1c-based case identification measures (Spearman rank correlation coefficients of 0.22 [P < 0.001] for A1c > 9%). Patients who are sicker will tend to have more touch points with the health care system and thus generate more medical claims, so there is a higher likelihood that they might be categorized as having uncontrolled diabetes. However, even with this limitation, the ICD-9 diagnosis code-based definition for uncontrolled diabetes is still a valid and unbiased measure to be compared across geographic locations, especially if we have no reason to believe that physicians in different MSAs have different tendencies to use these diagnosis codes. Prior to the implementation of the ICD-10 system, the Agency for Healthcare Research and Quality (AHRQ) diabetes quality measures included a measure on "uncontrolled diabetes admission rate" that also used ICD-9 codes to compare the quality of diabetes care across managed health plans (AHRQ, 2013).
• Medical claims data were unavailable for care veterans receive through the Veterans Health Administration and for patients in programs such as the Indian Health Service. For these government-sponsored programs, adults age 65 or older are counted under Medicare, and adults under age 65 are counted under Medicaid.
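The small-sample rule in the limitations above (substituting the state estimate, then the national estimate, when fewer than 30 adults are available for a demographic group) can be sketched as a simple fallback. The function name, tuple layout, and counts below are hypothetical and chosen only for illustration.

```python
MIN_SAMPLE = 30  # threshold stated in the text

def estimate_with_fallback(levels, min_n=MIN_SAMPLE):
    """Return (level_name, rate) from the first level with an adequate sample.

    `levels` is ordered from most local to most aggregated, e.g. MSA, then
    state, then national, each given as (name, numerator, denominator).
    """
    for name, num, n in levels:
        if n >= min_n:
            return name, num / n
    # No level reaches the threshold: use the most aggregated level available.
    name, num, n = levels[-1]
    return name, (num / n if n else float("nan"))

# Made-up counts for one demographic group: the MSA sample is too small.
levels = [
    ("MSA", 10, 12),
    ("state", 180, 200),
    ("national", 9200, 10000),
]
source, rate = estimate_with_fallback(levels)
```

Returning the level used alongside the rate makes it possible to report how many MSA-level cells were filled from state or national data, which is the source of the bias toward state and national averages noted in the text.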

Conclusion
This study used published information on the population and health characteristics of the population in the 50 largest metropolitan areas in the U.S. combined with analysis of medical claims files to estimate key metrics tracking access to care and treatment outcomes for people with diabetes (with the focus on type 2 diabetes). The study highlights that key diabetes prevalence and management metrics vary by MSA and can differ substantially from the national averages. Such information can increase awareness of areas where improvements can be made, inform strategies to improve diabetes screening and treatment, and, when tracked over time, inform population health management strategies.
Note: highest and lowest MSA estimates are bolded.

Funding
Funding for this study was provided by Novo Nordisk Inc. Study coauthor Erin Byrne (EB) is employed by Novo Nordisk Inc. and provided critical review and revision of the manuscript.

Conflict of interest
The authors report the following competing interests: WY, TMD, ET, WI, RC, and FEL provide paid consulting services to pharmaceutical companies and other Life Sciences organizations. EB is employed by Novo Nordisk Inc.

Acknowledgments
Appreciation is expressed to Jerry Franz, who commented on earlier versions of this paper.