Data regarding fracture incidence according to fracture site, month, and age group obtained from the large public health insurance claim database in Japan

The National Database of Health Insurance Claims and Specific Health Checkups of Japan includes all health insurance claims submitted in Japan and is considered representative of almost all health claims in the country. Data regarding fracture incidence, based on the documented diagnoses in the claims and relevant procedure codes, were extracted from this database. This data paper reports fracture incidence according to fracture site, month, and age group for the population of the Kanto area (Tokyo and surrounding areas), which consists of approximately 42 million people. These data supplement the article “Variation in Fracture Risk by Season and Weather: A Comprehensive Analysis across Age and Fracture Site Using a National Database of Health Insurance Claims in Japan” by Hayashi et al. and constitute one of the largest epidemiological datasets regarding seasonal differences in fracture incidence according to fracture site and age group.

© 2019 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Data
The data described in this article represent the number of cases of peripheral fractures stratified according to fracture site, calendar month, and age group, based on health insurance claims submitted by healthcare providers for the population of approximately 42 million in the Kanto area (Tokyo and surrounding areas) of Japan between April 2013 and March 2016 (Tables 1 and 2). The dataset provides comprehensive coverage of the incidence of peripheral fractures and includes all peripheral fracture sites and all age groups, from children (0–19 years) to the elderly (≥80 years). The data also describe fracture incidence for each calendar month, providing quantitative data on seasonal variation in fracture incidence. Fracture cases were extracted from the National Database of Health Insurance Claims and Specific Health Checkups of Japan (NDB), one of the largest healthcare-related databases in the world, using diagnosis codes and procedure codes specific to fractures. The total number of fracture cases in the data was 508,051. The codes and algorithms used to extract data from the NDB are shown for transparency and reproducibility of the data [2]. The data contain all health insurance claims submitted in the area and are representative of the incidence in the population.

Specifications table
Subject area: Orthopedic surgery, Healthcare-related database
More specific subject area: Epidemiology of fracture
Type of data  [1].

Value of the Data
The dataset comprises comprehensive epidemiological data on fractures across all age groups and fracture sites. Fracture incidence is described for a large population of >40 million, based on one of the world's largest health databases. With >500,000 fracture cases, this is one of the largest datasets describing seasonal variation in fracture incidence.
The codes and algorithms used to extract data from the database are described in this article for greater transparency and reproducibility. These data could be used as a benchmark in epidemiological research into fractures, because of the scale and completeness of the sample.

Extraction of data from the original database
The NDB is a database of all monthly claims of public health insurance in Japan, including all procedural codes, International Statistical Classification of Diseases, Tenth Revision (ICD-10) codes, and prescriptions, across inpatient and outpatient services. Because of the wide coverage of public health insurance, the NDB is considered representative of almost all health claims in Japan. We applied to use the NDB as members of a research group funded by a Health Science and Labor Research Grant from the Ministry of Health, Labour and Welfare, Japan, and permission was granted. We also obtained approval from the appropriate Institutional Review Board. An isolated database was created in the research group and consisted of claim data collected from the original NDB database between April 2013 and March 2016.

Matching more than one claim to the same individual
Although the NDB used two personal identification variables (hereafter referred to as ID1 and ID2) to link individual patients' insurance claims, the efficiency of this process was limited. Therefore, we used another identification variable (hereafter referred to as ID0), which was created by applying a patient-matching algorithm based on the ID1 and ID2 variables, as described previously [3].

Inclusion criteria for the claim data
Claims that fulfilled the inclusion criteria (Table 3) were extracted from our database with the ID0, procedural code, date of application for the procedure, date of hospitalization (if applicable), ICD-10 code, date of documentation for ICD-10 code, prefecture code, and age-group code. Fracture sites were then classified according to the fracture sites listed in Table 2B, using the ICD-10 codes in the claims.
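The extraction and site classification described above can be sketched as follows. This is a minimal illustration, not the extraction code used for the dataset: the mapping `ICD10_SITE_GROUPS` lists standard three-character ICD-10 fracture categories only as an example and does not reproduce the site groups of Table 2B, and the claim field names (`icd10`, `procedure_code`) and the function names are hypothetical.

```python
# Illustrative mapping only; the site groups actually used are those in Table 2B.
# Keys are standard three-character ICD-10 fracture categories.
ICD10_SITE_GROUPS = {
    "S42": "shoulder and upper arm",
    "S52": "forearm",
    "S62": "wrist and hand",
    "S72": "femur",
    "S82": "lower leg, including ankle",
    "S92": "foot",
}


def classify_site(icd10_code):
    """Return the site group for an ICD-10 code, or None if it is not listed."""
    # Normalize "s72.0" / "S720" style codes to the three-character category.
    category = icd10_code.strip().upper().replace(".", "")[:3]
    return ICD10_SITE_GROUPS.get(category)


def extract_claims(claims, procedure_codes):
    """Keep claims that carry a listed procedure code and a classifiable ICD-10 code.

    `claims` is an iterable of dicts with hypothetical keys "icd10" and
    "procedure_code"; `procedure_codes` is the set of fracture treatment codes.
    """
    extracted = []
    for claim in claims:
        site = classify_site(claim["icd10"])
        if claim["procedure_code"] in procedure_codes and site is not None:
            extracted.append({**claim, "site_group": site})
    return extracted
```

Keeping the code-to-site mapping in a single dictionary makes it straightforward to swap in the actual grouping from Table 2B without touching the filtering logic.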

Definition of cases
A case was defined as the first incidence of fracture to one of the sites shown in Table 2B between April 15, 2013, and March 17, 2016. Fracture incidence included records of claims with the fracture-specific treatment codes shown in Table 4A and the ICD-10 codes shown in Table 4B. Cases involving multiple fractures were considered single cases if multiple fractures occurred in only one group of sites; fractures that occurred in different groups of sites were classed separately for each group. Recurrent fracture cases that occurred in the same group of sites were excluded.
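The case definition can be illustrated with a short sketch that keeps one case per individual and group of sites. It rests on assumptions not stated in the text: each record is a hypothetical `(id0, site_group, fracture_date)` tuple, and a case is kept only when the earliest record for that individual and site group falls inside the study window (so a later fracture in the same group is treated as recurrent and never counted).

```python
from datetime import date

WINDOW_START = date(2013, 4, 15)
WINDOW_END = date(2016, 3, 17)


def first_incident_cases(records):
    """Return {(id0, site_group): date} with one case per individual and site group.

    `records` is an iterable of (id0, site_group, fracture_date) tuples.
    Fractures in different site groups are counted separately; repeated
    fractures within the same group collapse to the earliest one.
    """
    earliest = {}
    for id0, site_group, when in records:
        key = (id0, site_group)
        if key not in earliest or when < earliest[key]:
            earliest[key] = when
    # Assumed interpretation: keep the case only if its first date is in the window.
    return {
        key: when
        for key, when in earliest.items()
        if WINDOW_START <= when <= WINDOW_END
    }
```

An alternative reading would filter records to the window first and then take the earliest per key; the two differ only for individuals whose first record precedes April 15, 2013.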

Exclusion criteria
Cases were excluded if any two of the following dates were more than two weeks apart: the day of documentation of the diagnosis, the day of application of the treatment procedure, and the day of hospitalization. This rule was intended to omit hospital-acquired cases of fractures but include nosocomial fracture cases in which the documentation of diagnosis or treatment occurred several days after admission.
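Because "any pair of dates more than two weeks apart" is equivalent to the span between the earliest and latest date exceeding two weeks, the exclusion rule reduces to a single comparison. The sketch below assumes that two weeks means 14 days and that the hospitalization date may be absent for outpatient cases; the function name is hypothetical.

```python
from datetime import date, timedelta

MAX_SPREAD = timedelta(days=14)  # "two weeks", taken here as 14 days


def is_excluded(diagnosis_date, procedure_date, hospitalization_date=None):
    """Return True if any two of the available dates are more than 14 days apart."""
    dates = [
        d
        for d in (diagnosis_date, procedure_date, hospitalization_date)
        if d is not None
    ]
    # The largest pairwise gap is the gap between the earliest and latest date.
    return max(dates) - min(dates) > MAX_SPREAD
```

For example, a diagnosis documented on 1 March with a procedure on 15 March is retained (exactly 14 days apart), whereas a procedure on 16 March would lead to exclusion.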

Definition of the date of fracture incidence
The date of the first visit to a hospital/clinic for a fracture was considered a proxy for the date of fracture, as the claim data did not include the date of injury.
We created two interim datasets, Dataset A and Dataset B, to accurately describe the date of the first visit to a clinic/hospital for each fracture. Dataset A was created for collecting an accurate number of

Table 3
Criteria for the extraction of claims from the original database.

Dataset A
Purpose: Data for extracting cases
Criteria for extraction (claims that fulfilled all criteria were extracted):
1. Claims for both inpatient and outpatient services, submitted by clinics or hospitals located in the Kanto area (Tokyo and the six surrounding prefectures)
2. Claims that included the treatment procedure codes listed in Table 2A
3. Claims that included one of the ICD-10 codes listed in Table 2B

The earliest date on which the documentation of diagnosis, the application of the treatment procedure, or hospitalization occurred according to Dataset A was defined as the date of the first visit, as long as there was no claim involving the documentation of fracture diagnosis in the same group of sites in another hospital/clinic in Dataset B within the previous 14 days. For cases in which claims included the documentation of fractures in the same group of sites in other hospitals/clinics in Dataset B within the previous 14 days, the date of the earlier visit was considered the date of the first visit.
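The rule for the date of the first visit can be sketched as a small function. Several details are assumptions rather than statements from the text: the 14-day lookback is treated as inclusive of the 14th day, the earliest qualifying Dataset B date is used when more than one exists, and the lookback is applied once rather than repeated from the earlier date. The function and argument names are hypothetical.

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=14)


def first_visit_date(dataset_a_dates, dataset_b_diagnosis_dates):
    """Return the proxy date of fracture for one case.

    dataset_a_dates: dates of diagnosis documentation, treatment procedure,
        and hospitalization from Dataset A (None where not applicable).
    dataset_b_diagnosis_dates: dates on which a fracture in the same group of
        sites was documented for the same individual at another
        hospital/clinic (Dataset B).
    """
    earliest = min(d for d in dataset_a_dates if d is not None)
    # Earlier visits elsewhere that fall within the previous 14 days.
    prior = [
        d
        for d in dataset_b_diagnosis_dates
        if earliest - LOOKBACK <= d < earliest
    ]
    return min(prior) if prior else earliest
```

A visit to another clinic 10 days before the earliest Dataset A date therefore moves the date of the first visit back by 10 days, while a visit 50 days earlier leaves it unchanged.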

Statistical analysis
The numbers of fracture cases were accumulated and stratified according to the group of fracture sites involved, based on ICD-10 classification, and sub-classified into the following five age groups: 0–19, 20–39, 40–64, 65–79, and ≥80 years. All analyses were performed using SPSS (version 24). The terms of use for the NDB prevented us from reporting fracture sites with 10 cases, to protect patient privacy. For fracture sites with lower incidence rates, the upper limits of the incidence ranges were reported instead of precise values.
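The stratification can be illustrated with a short Python sketch (the analyses themselves were performed in SPSS). It assumes that each case is a hypothetical `(site_group, first_visit_date, age_in_years)` tuple and that counts are pooled by calendar month across years; the NDB supplies an age-group code rather than an exact age, so the `age_group` helper is only a stand-in for that code.

```python
from collections import Counter
from datetime import date


def age_group(age):
    """Map an age in years to one of the five age groups used in the data."""
    if age < 0:
        raise ValueError("age must be non-negative")
    for upper, label in ((19, "0-19"), (39, "20-39"), (64, "40-64"), (79, "65-79")):
        if age <= upper:
            return label
    return ">=80"


def count_cases(cases):
    """Count cases by (site_group, calendar month, age group).

    `cases` is an iterable of (site_group, first_visit_date, age_in_years).
    Months are pooled across years (an assumption made for this sketch).
    """
    counts = Counter()
    for site_group, visit_date, age in cases:
        counts[(site_group, visit_date.month, age_group(age))] += 1
    return counts
```

The resulting counter can be tabulated by site and month to inspect seasonal variation within each age group.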