Identification of rheumatoid arthritis in German claims data using different algorithms: Validation by cross‐sectional patient‐reported survey data

To evaluate different algorithms for the identification of rheumatoid arthritis (RA) in claims data using patient‐reported diagnosis as reference.

• However, as M06.9 is the most frequently used code for RA, selecting only M05 would result in a high loss of confirmed RA cases.
• We propose to select the criteria for identification depending on the research question, considering the trade-off between a relevant loss of case numbers and precision.
• If only definite RA cases are of interest, requiring a specific medication in addition to ICD-10 codes seems to be most suitable for identifying RA.

Plain Language Summary
Claims data are increasingly used for health services research into rheumatic diseases. For this purpose, it is important to reliably identify persons with rheumatic diseases. This study examined how suitable the claims diagnosis of rheumatoid arthritis (RA) is as the sole criterion and whether additional data increase the probability that the diagnosis is true. To this end, a sample of persons with a claims diagnosis of RA was asked whether they really had RA, and their answer was used as the reference. The claims diagnosis of RA as the sole criterion correctly identified persons with RA in 80% of cases. The proportion correctly identified was even higher when persons were classified as having RA only if an anti-rheumatic drug had been prescribed or if inflammatory markers had been examined. However, these additional requirements also resulted in individuals with confirmed RA not being included. Which criteria are best suited to identify individuals with an RA diagnosis therefore depends on the research question: if only definite RA cases are of interest, requiring a specific medication in addition to ICD-10 codes seems to be most suitable for identifying RA. However, if all persons with RA are to be included, for example, for questions of health care provision, then the claims diagnosis alone may be more advantageous.

| INTRODUCTION
Health services research focusing on the care of patients with rheumatoid arthritis (RA) relies on data that come from an unselected, representative population. While the German RA registries and cohort studies recruit patients from rheumatology care, 1-3 claims data from statutory health insurances could be a valuable source to recruit an unselected population. About 90% of German inhabitants are covered by statutory health insurance companies. Claims data store information about ICD-10 (International Classification of Diseases, 10th revision) codes, prescribed treatments like medication or physical therapy, the specialty of the visited physician, performed diagnostic procedures and more.
For studies using these data on questions regarding RA, researchers need to define a cohort carefully as different case definitions of RA strongly influence the results. Several studies investigated the prevalence of RA in Germany using claims data. They used a range of case definitions to obtain an RA diagnosis including the prerequisite of two outpatient or one inpatient ICD-codes, laboratory measures of inflammatory markers, specialized care by rheumatologists and specific treatments with disease modifying antirheumatic drugs (DMARDs) or glucocorticoids. [4][5][6][7][8] Depending on the algorithm used, the considered period, and the denominator used to calculate the prevalence, the prevalence of RA ranges between 0.6% and 1.4%. 4,5 As physicians document ICD-codes in claims data for billing purposes, these diagnoses might not always meet the criteria for a clinical diagnosis. It is therefore important to validate case definitions against clinical diagnoses or other external data sources. While there is abundant research on the validity of case definitions of RA patients in US American and Canadian claims data sources, [9][10][11][12][13][14] we could not identify such a study with German claims data. Therefore, the primary aim of this analysis was to compare different algorithms to identify patients with RA in German claims data and to assess their performance compared to the patient confirmation of RA diagnosis as a gold standard. A secondary aim was to compare algorithms with regard to their discriminative properties within the group of persons with a claims diagnosis of RA.

| Sample
We used data from the PROCLAIR project (Linking Patient-Reported Outcomes with Claims data for health services research In Rheumatology). The methods for this project have been described in detail elsewhere. 15 Briefly, data from a large German statutory health insurance (BARMER) were used.
Persons in the sample were sent a questionnaire by their health insurance company in June 2015 that (among other things) asked them whether they had RA. The question was phrased "What does your attending physician call the disease you are suffering from?" with the answer options "chronic polyarthritis," "rheumatoid arthritis," "rheumatism of the joints," and "other (please specify)." The answer to this question was used as the gold standard for the diagnosis of RA.
Table A1). The Elixhauser comorbidity score 16 was calculated.
Weighting was done because we used a stratified sample. The sample was stratified to ensure that we could also analyze underrepresented groups of RA patients (e.g., men younger than 50 years). The weights were calculated as the number of persons in the total BARMER population in a stratum divided by the number of respondents in that stratum.
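The weight calculation described above can be illustrated with a minimal Python sketch. It is not the authors' SAS code: the function name `stratum_weights`, the stratum labels, and all counts are hypothetical and serve only to show the ratio of population size to respondents per stratum.

```python
from collections import Counter


def stratum_weights(population_counts, respondent_strata):
    """Return one weight per stratum: population size / number of respondents.

    population_counts: dict mapping stratum -> persons in the insured population
    respondent_strata: iterable with the stratum of each survey respondent
    """
    respondents = Counter(respondent_strata)
    weights = {}
    for stratum, n_respondents in respondents.items():
        # Each respondent stands for this many insured persons in the stratum.
        weights[stratum] = population_counts[stratum] / n_respondents
    return weights


# Hypothetical strata (sex, age group) and counts, for illustration only.
population_counts = {("male", "<50"): 1200, ("female", "65+"): 9000}
respondent_strata = [("male", "<50")] * 40 + [("female", "65+")] * 150

weights = stratum_weights(population_counts, respondent_strata)
```

With these made-up numbers, each responding man younger than 50 represents 30 insured persons and each responding woman aged 65 or older represents 60.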

| Statistical analysis
The weights are reported in Table A9.
As a sensitivity analysis, we calculated the positive predictive value (PPV) for all patients using the 2015 claims diagnosis of RA.
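As a rough illustration of how a weighted PPV can be obtained in a sample where every person has a claims diagnosis, the sketch below divides the weighted number of persons who confirmed RA by the weighted number of all respondents. This is an assumption about the general approach, not a reproduction of the SAS SURVEY procedures used in the study; the function name `weighted_ppv` and the example values are hypothetical.

```python
def weighted_ppv(confirmed, weights):
    """Weighted share of respondents who confirmed the RA diagnosis.

    confirmed: list of bool, True if the respondent reported having RA
    weights:   list of float, the stratum weight of each respondent
    """
    if len(confirmed) != len(weights):
        raise ValueError("confirmed and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Numerator: weighted count of claims diagnoses confirmed by the patient.
    confirmed_weight = sum(w for c, w in zip(confirmed, weights) if c)
    return confirmed_weight / total


# Hypothetical respondents, all with a claims diagnosis of RA.
confirmed = [True, True, False, True]
weights = [30.0, 60.0, 60.0, 30.0]

ppv = weighted_ppv(confirmed, weights)
```

Here three of four respondents confirm the diagnosis, but because the non-confirming respondent carries a large weight, the weighted PPV is 120/180, or about 0.67, rather than the unweighted 0.75.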
We performed the statistical analyses using the SURVEY procedures in SAS version 9.4. 17 Additionally, we estimated the sensitivity and specificity, the Youden Index (sensitivity + specificity − 1), the positive (LR+) and negative (LR−) likelihood ratios and the diagnostic odds ratio (LR+/LR−) for the considered algorithms to detect RA among those with two claims diagnoses of M05/M06. The LR+ is the probability that a person is categorized as having RA given that the person actually has RA, divided by the probability that a person is categorized as having RA given that the person does not have RA.
Table 1 shows the characteristics of the sample, comparing persons who confirmed the RA diagnosis, those who did not and those who did not respond. The mean ages (65, 63, and 63 years), the Elixhauser comorbidity index (4.1, 4.1, and 3.9) and the proportion of women (80%, 78%, and 79%) were comparable among the groups. Prescription rates for csDMARDs, bDMARDs and glucocorticoids were much higher among those who confirmed the diagnosis of RA than among those who did not or those who did not respond.
unspecified") and M05.8 ("Other seropositive rheumatoid arthritis") were the second most common diagnoses (each 15%). Among the persons not confirming an RA diagnosis, the most common codes were M06.9 (64%), M06.0 (12%, "Seronegative rheumatoid arthritis") and M05.9 (9%), see Appendix Table A2.

| Consistency of ICD-10 RA diagnosis 2013-2020
Table 3).
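The discriminative measures named in the statistical analysis follow from a 2x2 table that crosses the algorithm result with the patient-reported diagnosis. The Python sketch below uses the standard definitions (LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity); the function name `discrimination` and the cell counts are hypothetical and unweighted, so this only illustrates the formulas and is not the survey-weighted analysis of the study.

```python
from dataclasses import dataclass


@dataclass
class Discrimination:
    sensitivity: float
    specificity: float
    youden: float
    lr_pos: float
    lr_neg: float
    dor: float


def discrimination(tp: int, fp: int, fn: int, tn: int) -> Discrimination:
    """Compute discriminative measures from the cells of a 2x2 table.

    tp: algorithm criterion met, RA confirmed by the patient
    fp: algorithm criterion met, RA not confirmed
    fn: algorithm criterion not met, RA confirmed
    tn: algorithm criterion not met, RA not confirmed
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive for the ratios to be defined")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    youden = sensitivity + specificity - 1
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    # The diagnostic odds ratio is the ratio of the two likelihood ratios.
    dor = lr_pos / lr_neg
    return Discrimination(sensitivity, specificity, youden, lr_pos, lr_neg, dor)


# Hypothetical counts among persons with two M05/M06 claims diagnoses.
result = discrimination(tp=60, fp=20, fn=40, tn=80)
```

For these invented counts, sensitivity is 0.6 and specificity 0.8, which gives a Youden Index of about 0.4, an LR+ of about 3, an LR− of 0.5 and a diagnostic odds ratio of about 6.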
The algorithm with the best discriminative properties (the highest Youden Index, the highest positive likelihood ratio, the lowest negative likelihood ratio and the highest diagnostic odds ratio) was specific medication (diagnostic odds ratio 3.0).
TABLE 3 Discriminative properties of algorithms to identify rheumatoid arthritis in a population with M05/M06 diagnosis.
Other studies have examined other populations and used other gold standards, [10][11][12]14,19 for example, a chart diagnosis or rheumatologist consultation/diagnosis, so that the results are not directly comparable.
This seems to be independent of seropositivity and is also seen in other specialties. Longitudinal health insurance data on diabetes, colorectal cancer, and heart failure also show that diagnoses are not always continued consistently. 20 The results of the study also show that in German claims data, RA is predominantly coded non-specifically (M05.9, M06.9). This is also found in other German analyses, for example, regarding the coding of depression. 21 The predominance of M06 diagnoses is not reported from other countries. In data from Norway, two-thirds of RA cases are coded as seropositive (M05) when two ICD-10 diagnoses are required as the inclusion criterion. 21 This means that in German claims data, the proportion of incorrectly coded diagnoses may be higher and the proportion of seropositive RA is probably underestimated when only M05 cases are considered, limiting the generalizability of our data. Therefore, if we choose the inclusion criteria too widely, we end up with too many false positive cases, but if we choose them too strictly and exclude the non-specific codes, as done by Grellmann et al., 8 we lose too many cases that would be true RA cases after all.

| Limitations and strengths
A strength of PROCLAIR is that we obtained the sample from the general German population, also covering RA patients who are not in specialized rheumatologic care. Being able to identify persons who do not see a rheumatologist is the foundation for detecting possible deficiencies in healthcare provision. BARMER is one of the largest health insurance companies in Germany and covers around 12% of the persons who have statutory health insurance. Deviations in the structure of the insured persons (BARMER has a slightly higher proportion of elderly women compared to other insurance companies 22 ) can be taken into account through standardization. 4 Within this large sample, we were able to obtain age- and sex-specific results with a sufficient number of cases in the individual strata. The survey part of our study had a response rate of 51%, which is high compared to similar studies. 23,24 Another strength of the study is the longitudinal follow-up of ICD-10 codes.
Limitations include differences between survey responders and non-responders, which may arise because persons who actually have RA differ from those who do not in their willingness to respond to the survey and to consent to data linkage. One observed difference is that survey non-responders were treated with DMARDs less frequently than responders. An evaluation from PROCLAIR of persons with an osteoarthritis or axSpA diagnosis had already revealed that survey responsiveness differed according to age and sex. Moreover, survey responders visited specialists and received health care interventions, such as vaccinations or prescriptions for specific drugs and physical therapy, more frequently than non-responders. 25 While those differences between survey responders and non-responders probably affect the PPVs, there is no reason to believe that they would change the ranking of the most suitable algorithm to identify RA patients in German claims data.
Our algorithms rely on ICD-10 codes recorded by physicians and additional information available in claims data. Yet, we lack data on classification criteria for RA, as well as data on clinical investigations. As we have not contacted any persons without two diagnostic codes of RA in the claims data, we cannot assess sensitivity and specificity for the different algorithms to identify RA in claims data for the general population.
In PROCLAIR, we used patient-reported diagnoses as a gold standard instead. Given that the sample population was randomly selected from all areas of Germany and included persons who are not in rheumatologist care, it would not have been feasible to contact all treating physicians and ask them to confirm the existence of a clinical RA diagnosis. We decided to use the patient-reported diagnosis as a proxy for the clinical diagnosis. This is a limitation of our study. Guillemin et al. 26 concluded that self-reported diagnosis is the single most useful item to identify patients with a clinical diagnosis of RA when no data are available directly from the treating physician.

| CONCLUSIONS
The ICD-10 codes M05 and, to a lesser extent, M06 have high PPVs and are therefore suitable for identifying persons with RA in German claims data.
Depending on the research question, additional requirements can lead to a more precisely defined cohort at the cost of lower case numbers. We found the additional prerequisite of a prescription of specific medication to be the most useful algorithm considering this trade-off.