Evaluation of reporting quality of cohort studies using real-world data based on RECORD

doi:10.21203/rs.3.rs-2407386/v1

Download PDF

Research Article

Evaluation of reporting quality of cohort studies using real-world data based on RECORD

https://doi.org/10.21203/rs.3.rs-2407386/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 29 Jun, 2023

Read the published version in BMC Medical Research Methodology →

You are reading this latest preprint version

Objective: Real-world data (RWD) and real-world evidence (RWE) have been paid more and more attention in recent years. We aimed to evaluate the reporting quality of cohort studies using real-world data (RWD) published between 2013 to 2021 and analyze the possible factors.

Methods: We conducted a comprehensive search in Medline and Embase through the OVID interface for cohort studies published from 2013 to 2021 on April 29, 2022. The evaluation was based on the REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. Pearson chi-square test or Fisher's exact test and Mann-Whitney U test were used to analyze the possible factors, including the release of RECORD, journal IFs, and article citations.

Results: 187 articles were finally included. The mean±SD of the percentage of adequately reported items in the 187 articles was 44.7±14.3 with a range of 11.1% to 87%. Of 23 items, the adequate reporting rate of 10 items reached 50%, and the reporting rate of some vital items was inadequate. The reporting of only two items significantly improved after the release of RECORD and there was no significant improvement in the overall report quality. The journal IFs and citations were respectively related to 6 areas and significantly higher in high-reporting quality articles.

Conclusion: The reporting of cohort studies using RWD was generally inadequate and has not improved in recent years. We encourage researchers to endorse relevant guidelines when utilizing RWD for research.

RWD

RWE

cohort studies

reporting quality

RECORD

Real-world data is defined as the data relating to patient health status and/or the delivery of health care that is routinely collected from a variety of sources, such as patient registries, electronic medical records (EMRs), electronic health records (EHRs), insurance claims, and patient health records [1,2], by the US Food and Drug Administration [3]. Real-world evidence (RWE) generated by studies using RWD could be the complement of clinical trials for observing the effectiveness or safety of drugs, products, operations, or any other treatment measures and play an increasingly significant role in decision-making while it is more reflective of clinical practice compared to the evidence generated by randomized trials [4–7]. However, the reporting of studies using RWD currently exists with the problems of inadequate and lack of transparency, which will limit the reproducibility and replicability of studies and arouse concerns, doubts, and reductions of confidence in RWE [8–11]. Some reporting guidelines have been established to standardize studies reports and promote the quality of RWE [9,12–15].

The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement [12], which is an extension of the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) statement [16], was created to address the specific reporting issues of studies using routinely-collected data. Previous studies have shown the reporting of observational studies was usually insufficient whether based on RECORD or STROBE [17–19].

According to the RECORD checklist, we evaluated the quality of reporting on specific aspects of studies using RWD, such as codes and algorithms, data linkage and cleaning, and discussion of peculiar limitations. We concentrated on cohort studies that compared the effectiveness or safety of exposure factors, since cohort studies are helpful to provide evidence indicating causality, the strength of correlation between exposure factors and outcomes, and can usually produce highly generalizable results [20,21] while comparative research could complement or assess the evidence originated from randomized trials and inform decisions about health policy and clinical care [22]. The RECORD checklist was transformed into a series of questions for convenient and accurate evaluation.

We aimed to evaluate the reporting quality of cohort studies using RWD published from 2013 to 2021 and analyze the possible factors of reporting quality. Comparative analyses were conducted to ascertain whether the report quality is related to the release of the RECORD statement, journal impact factors (IFs), and citations of individual articles.

2.1. Eligibility of studies

We selected cohort studies that aimed to compare the effectiveness and safety of the exposure factors and used real-world data, including prospective, retrospective, and both. For example, we excluded studies that classified populations by disease type [23]. Studies on morbidity, mortality, or hazard factors of diseases and economic benefits or medical expenses of exposure factors were excluded. We only considered the English language articles published between 2013 to 2021. For the wider applicability of research results, no more restrictions on research diseases or participants, exposure measures, and the settings of control groups.

2.2. Search strategy

We conducted a comprehensive search in Medline and Embase through the OVID interface for English language articles published between 2013 to 2021(Search on April 29, 2022). The indexing terms included study design, data sources of studies (e.g., “routinely collected data”, “health information system”, “electronic medical record”, “registry”), excluded publication types or article types (e.g., “review”, “protocol”, “meta-analysis”) and excluded outcomes. The search filters developed by Scottish Intercollegiate Guidelines Network [24] and the strategy developed by Lars G. Hemkens et al [19] were integrated into our search strategy (available in S1 File).

2.3. Screening and data extraction

We imported search results into Note Express (Beijing Aegean Hailezhi Technology Co., Ltd., Beijing, China) for management, removing duplicated articles and screening. First, articles that did not conform to the eligibility criteria were excluded by independently screening the titles and abstracts by two reviewers (R.Z. and W.Z.). Subsequently, we downloaded all the available full texts of preliminary included articles. Two reviewers (R.Z. and W.Z.) read each full-text article to screen out the articles that were ultimately included and record the reason for exclusion. Any discrepancies in the screening process were resolved via discussion or determined by a third author (one of B.W., Z.D.Z, C.H.).

We extracted the characteristics of each qualified article including the year of publication, country of the corresponding author, type of disease, journal name, journal IFs in the year of publication, citations, type of therapy, and type of data source. To guarantee the accuracy of the journal names and journal impact factors, we identified the journals by DOI, PMID, or site link provided in the articles and searched the ISSN of journals on the Web of Science to obtain the journal impact factors in the year of publication. Article citations were accessed on Google Scholar. Two reviewers (two of R.Z., W.Z., Z.D.Z, and C.H.) completed the process of extraction and recorded data in Microsoft Excel (Microsoft Corporation, USA). Any discrepancies were resolved via consensus.

2.4. Evaluation

Included articles were evaluated by the RECORD checklist, an extension of the STROBE checklist. Considering that some items of the RECORD checklist contain various aspects, we split and transformed the 13 items of the RECORD checklist into 23 questions that can be answered “yes”, “partly yes”, “no”, or “not applicable”. Reported as “yes” was considered adequate reporting while reported as “partly yes” or “no” was considered inadequate reporting. Some questions may not correspond to the RECORD checklist, but they are emphasized in the explanations of the RECORD statement and are necessary to be reported. For the split items, we renamed the different items with lowercase letters (e.g., R1.1 split to R1.1a, R1.1b). Five items appear to be inapplicable, in brief, three items (R1.3, R6.3, and R12.3) would be inapplicable if the study did not require database linkage and two items (R6.2 and R19.1e) would be inapplicable if the study did not report the codes or algorithms for population selection. We considered that other items should be reported in studies using RWD and therefore applied by default. The complete list of questions, descriptions, and examples can be obtained in S1 Table.

We randomly selected 10 included articles and four reviewers (R.Z., W.Z., Z.D.Z, C.H.) preliminary evaluated them through the constructed 23 questions. Afterwards every reviewer proposed problems and suggestions generated in the preliminary evaluation for a panel discussion to eliminate all discrepancies and ameliorate the 23 questions. Finally, two independent reviewers (R.Z. and W.Z.) evaluated each of included articles and integrated the results in Microsoft Excel (Microsoft Corporation, USA), any discrepancies were resolved via discussion or determined by two other authors (Z.D.Z, C.H.).

2.5. Data analysis

We respectively performed horizontal and vertical calculations. For each item, the number of answers that were “yes”, “partly yes”, “no” and “not applicable” was calculated, as well as the percentage after removing the inapplicable items. For each article, the number of each type of answer was similarly calculated, as well as the rate of adequate reporting which was regarded as the overall quality of the individual articles.

To compare the reporting quality before and after the release of RECORD (released on October 6, 2015), we conducted a before-and-after analysis, with articles published from 2013 to 2015 defined as Pre-RECORD articles and articles published from 2018 to 2021 defined as Post-RECORD articles. Considering the time required to publish the article and the dissemination of RECORD, a two-year interval was allowed.

The median and interquartile range (IQR) were used to describe continuous variables. We compared the reporting of each item of Pre-RECORD articles and Post-RECORD articles using the Pearson chi-square test or Fisher's exact test and calculated odd ratios and 95% confidence intervals. Due to the non-normal distribution of the data, the Mann-Whitney U test was used to analyze the correlations of journal IFs and citations with reporting quality. We also set a 50% reporting rate as a threshold to compare the difference in impact factors and citations between articles with higher reporting quality and others reporting relatively lower. Articles without IF or citations would not be included in the analysis. A p-value less than 0.05 with two-sided was considered a statistically significant difference and all statistical analyses were performed on SPSS v26.0 (IBM Corp., Armonk, NY, USA).

3.1. Screening results and characteristics of included studies

A total of 5824 articles were identified, and 187 articles were finally included after the screening (Figure 1). The complete list of included articles is available in S2 File. The number of articles has been on the rise in recent years. Articles from the USA were at most (56,29.9%). Cardiovascular diseases (65,34.8%) and carcinoma/neoplasms (39, 20.9%) are the most studied diseases. The most common therapy is medication (144,77%). Most studies used a single database (121, 64.7%), with multicenter registries accounting for the majority (116, 62%). Details about the characteristics of included studies are shown in Table 1.

Table 1. characteristics of included studies

Characteristics	Studies [n (%)]	Characteristics	Studies [n (%)]
Country		Publication Year
USA Germany Japan Korea China Denmark Spain Italy Other	56(29.9) 16(8.6) 14(7.5) 13(7) 11(5.9) 10(5.3) 9(4.8) 9(4.8) 59(31.6)	2013 2014 2015 2016 2017 2018 2019 2020 2021	8(4.3) 10(5.3) 15(8) 14(7.5) 24(12.8) 26(13.9) 29(15.5) 31(16.6) 30(16)
Type of Diseases		Type of Therapies
Cardiovascular Disease Carcinoma/ Neoplasm Metabolic Disease Rheumatic Disease Nervous System Disease Skin Disease Liver Disease Other diseases	65(34.8) 39(20.9) 19(10.2) 14(7.5) 8(4.3) 7(3.7) 6(3.2) 29(15.5)	Medication Surgical Procedure Chemoradiotherapy Other	144(77) 36(19.3) 5(2.7) 13(16)
Type of Data Sources		Number of databases
Multicenter Registry Single center registry EMR/EHR Claims Other	116(62) 6(3.2) 34(18.2) 11(5.9) 29(15.5)	1 2 3 4 ≥5	121(64.7) 32(17.1) 13(7) 5(2.7) 16(8.6)
Journal IF (n=176)		Citations (n=187)
Median (IQR) Range	3.54(2.4,5.55) 0.52-27.97	Median (IQR) Range	12(6,25) 0-548

IQR, interquartile range; EMR, electronic medical record; EHR, electronic health record; IF, impact factor

Country means the country of the corresponding author; therapies involve exposure group and control group; journal IF is in the publication year of the article.

3.2 Reporting quality of cohort studies using RWD

The mean±SD of the percentage of adequately reported items in the 187 articles was 44.7±14.3 with a range of 11.1% to 87% (disregarding inapplicable items). 72 articles (38.5%) adequately reported 50% and above items, the evaluation details of each article can be obtained in S3 File.

Out of 23 items in total, the adequate reporting rate of 10 items reached 50%, and the reporting rate of some vital items was inadequate. Of 66 (35.3%) studies involving database linkage, 33 (50%), 12 (18.2%), and 28(42.4%) studies adequately reported databases linkage in title or abstract (R1.3), flowcharts or other diagrams to display linkage (R6.3), and the detail of linkage (R12.3) respectively (Table 2). Of the 75 (40.1%) articles that adequately reported codes or algorithms of population selection (R6.1), only 21(28%), 19 (25.3%) articles adequately reported the validation methods (R6.2), and the limitations of codes or algorithms and the validation used to population selection (R19.1c) respectively (Table 2).

In addition, the reporting of some other items was also critically insufficient, such as codes of algorithms (R7.1a:28, 15%; R7.1b:43, 23%; R7.1c:23, 12.3%), data-cleaning methods (R12.2:65, 34.8%), discussion of change in eligibility of results over time (R19.1e:34, 18.2%), and availability of study protocol and raw study (R22.1a:44, 23.5%, R22.1b:38, 20.3%) (Table 2).

Table 2. Reporting of cohort studies using RWD

Items	Brief Descriptions	Reporting [n (%)]			Not Applicable
Items	Brief Descriptions	Yes	Partly Yes	No	Not Applicable
Title and abstract
R1.1a	Type of data	182(97.3)		5(2.7)
R1.1b	Name of databases	134(71.7)	2(1.1)	51(27.2)
R1.2a	Geographic region	117(62.6)		70(37.4)
R1.2b	Time frame	97(51.9)		90(48.1)
R1.3	Linkage between databases	33(50)		33(50)	121
Methods
R6.1	Population selection methods	75(40.1)	112(59.9)
R6.2	Validation of codes/algorithms for population selection	21(28)		54(72)	112
R6.3	Graphical display of data linkage process	12(18.2)		54(81.8)	121
R7.1a	Codes/algorithms of exposures	28(15)	2(1)	157(84)
R7.1b	Codes/algorithms of outcomes	43(23)	34(18.2)	11(5.9)
R7.1c	Codes/algorithms of confounders and effect modifiers	23(12.3)	7(3.7)	157(84)
R12.1	The extent of database accessed	153(81.8)		34(18.2)
R12.2	Data-cleaning methods	65(34.8)		122(65.2)
R12.3	Detail of data linkage	28(42.4)	28(42.4)	10(15.2)	121
Results
R13.1	Detail of population selection	107(57.2)	19(10.2)	61(32.6)
Discussion
R19.1a	Inherent limitations	155(82.9)		32(17.1)
R19.1b	Risk of misclassification	58(31)		129(69)
R19.1c	Limitations of codes/algorithms and the validation	19(25.3)		56(74.7)	112
R19.1d	Implications of the missing variables and missing data	118(63.1)		69(36.9)
R19.1e	Eligibility of results over time	34(18.2)		153(81.8)
Other information
R22.1a	Study protocol availability	44(23.5)		143(76.5)
R22.1b	Raw data availability	38(20.3)		149(79.7)
R22.1c	Supplementary materials availability	100(53.5)		87(46.5)

RWD, real-world data

3.3 Reporting before and after the release of RECORD

Compared to Pre-RECORD period (33 articles), significant improvement in reporting quality for only two items in Post-RECORD period (116 articles): reporting of availability of raw data (R22.1b, p＜0.001), and availability of other supplementary materials (R22.1c, OR: 3.2, 95%CI:1.43-7.16, p=0.004)(Table 3). There was no significant improvement in the overall report quality.

Table 3. Reporting before and after the release of RECORD

Items	Reporting	Studies(n)		OR (95%CI)	*p-value*
Items	Reporting	Pre-RECORD	Post-RECORD	OR (95%CI)	*p-value*
R1.1a	Yes Partly Yes/No	32 1	115 1	3.59(0.22,59.06)	0.395
R1.1b	Yes Partly Yes/No	26 7	84 32	0.71(0.28,1.79)	0.462
R1.2a	Yes Partly Yes/No	18 15	77 39	1.65(0.75,3.61)	0.212
R1.2b	Yes Partly Yes/No	16 17	64 52	1.31(0.6,2.84)	0.497
R1.3	Yes Partly Yes/No	6 10	26 16	2.71(0.83,8.89)	0.095
R6.1	Yes Partly Yes/No	14 19	51 65	1.07(0.49,2.33)	0.875
R6.2	Yes Partly Yes/No	6 8	12 39	0.41(0.12,1.42)	0.152
R6.3	Yes Partly Yes/No	3 13	9 33	1.18(0.28,5.07)	0.822
R7.1a	Yes Partly Yes/No	4 29	23 93	1.79(0.57,5.61)	0.443
R7.1b	Yes Partly Yes/No	8 25	30 86	1.09(0.44,2.68)	0.851
R7.1c	Yes Partly Yes/No	4 29	14 102	1(0.3,3.26)	1
R12.1	Yes Partly Yes/No	27 6	103 13	1.76(0.61,5.06)	0.289
R12.2	Yes Partly Yes/No	11 22	42 74	1.14(0.5,2.57)	0.761
R12.3	Yes Partly Yes/No	5 11	22 20	2.42(0.72,8.18)	0.149
R13.1	Yes Partly Yes/No	19 14	71 45	1.16(0.53,2.55)	0.71
R19.1a	Yes Partly Yes/No	27 6	96 20	1.07(0.39,2.92)	0.9
R19.1b	Yes Partly Yes/No	11 22	33 83	0.8(0.35,1.82)	0.587
R19.1c	Yes Partly Yes/No	6 8	12 39	0.41(0.12,1.42)	0.152
R19.1d	Yes Partly Yes/No	25 8	71 45	0.51(0.21,1.22)	0.123
R19.1e	Yes Partly Yes/No	8 25	21 95	0.69(0.27,1.74)	0.432
R22.1a	Yes Partly Yes/No	9 24	28 88	0.85(0.35,2.04)	0.713
R22.1b	Yes Partly Yes/No	0 33	36 80	--	＜0.001
R22.1c	Yes Partly Yes/No	12 21	75 41	3.2(1.43,7.16)	0.004
Total		--	--	1.17(0.98,1.39)	0.082

RECORD, REporting of studies Conducted using Observational Routinely-collected health Data

Pre-RECORD refers to articles published in 2013-2015, Post-RECORD refers to articles published in 2018-2021.

3.4 Correlations of journal IFs and citations with reporting quality

The journal IFs were significantly higher for adequately reported articles in 6 items: population selection methods (R6.1: p=0.04), codes or algorithms of outcomes (R7.1b: p=0.014), the extent of database accessed (R12.1: p=0.004), eligibility of results over time (R19.1e: p=0.001), availability of raw data and supplementary materials (R22.1b: p＜0.001; and R22.1c: p=0.002)(Table 4). The citations of articles were also significantly higher for adequately reported articles in 6 items: R6.1: p=0.035; R7.1a: p=0.02; R7.1c: p=0.001; R19.1a: p=0.01; R19.1d: p=0.001; R22.1b: p=0.015 (Table 4). In total, the journal IFs and citations were significantly higher for articles with advanced reporting rates (≥50%)(IFs: 4.2 versus 3.17, p=0.002; Citations: 16.5 versus 10, p=0.004)(Table 4).

Table 4. Correlations of journal IFs and citations with reporting quality

Items

Reporting

Journal IFs

Citations

Studies(n)

Median (IQR)

p-value

Studies(n)

Median (IQR)

p-value

R1.1a

Yes

Partly Yes/No

171

3.58(2.39,5.55)

3.19(1.56,4.94)

0.49

182

12(6,26)

12(4.5,15)

0.458

R1.1b

Yes

Partly Yes/No

125

3.71(2.46,5.52)

3.39(2.28,6.24)

0.841

134

12 (7,25.25)

12(4,25.5)

0.493

R1.2a

Yes

Partly Yes/No

110

3.85(2.45,5.56)

3.3(2.35,5.54)

0.446

117

12(5.5,24.5)

12.5(7,26)

0.602

R1.2b

Yes

Partly Yes/No

3.42(2.3,5.17)

3.79(2.46,6.11)

0.376

10(4.5,24.5)

15(7.75,26.25)

0.113

R1.3

Yes

Partly Yes/No

4.4(2.65,5.36)

4.65(3.3,8.62)

0.322

10(4.5,30.5)

17(7,39.5)

0.16

R6.1

Yes

Partly Yes/No

106

4.11(2.65,5.99)

3.24(2.26,5.22)

0.04

112

16(7,32)

10.5(5,19.75)

0.035

R6.2

Yes

Partly Yes/No

4.01(3.14,6.13)

4.15(2.38,5.99)

0.326

23(9,60.5)

15.5(6.75,27.25)

0.088

R6.3

Yes

Partly Yes/No

4.53(3.3,5.66)

4.4(2.9,6.73)

0.906

16.5(4.25,36.5)

13.5(5.75,35)

0.874

R7.1a

Yes

Partly Yes/No

151

4.07(2.78,6.8)

3.4(2.32,5.48)

0.116

159

17(9.5,43.75)

11(6,23)

0.02

R7.1b

Yes

Partly Yes/No

136

4.15(3.23,6.45)

3.24(2.23,5.49)

0.014

144

15(7,41)

11.5(5.25,23)

0.129

R7.1c

Yes

Partly Yes/No

156

4.58(2.82,12.43)

3.42(2.35,5.42)

0.036

164

19(12,66)

10.5(5,23)

0.001

R12.1

Yes

Partly Yes/No

145

3.94(2.55,5.71)

2.6(1.71,4.4)

0.004

153

12(6,27.5)

12.5(5.5,18.75)

0.534

R12.2

Yes

Partly Yes/No

115

4.07(2.62,6.21)

3.28(2.32,5.1)

0.072

122

14(7,29.5)

11(5,23)

0.276

R12.3

Yes

Partly Yes/No

5.16(3.61,8.46)

4.01(2.64,6.07)

0.086

15(4.25,51.5)

13(5,30)

0.559

R13.1

Yes

Partly Yes/No

101

3.95(2.62,5.71)

3.19(2.14,5.49)

0.119

107

11(5,23)

15(6.25,26)

0.628

R19.1a

Yes

Partly Yes/No

145

3.75(2.46,5.59)

3.15(1.95,5.49)

0.133

155

13(7,27)

8.5(3.25,15,75)

0.01

R19.1b

Yes

Partly Yes/No

122

4.02(2.59,6.51)

3.47(2.34,5.22)

0.135

129

15(8,30)

10(5,23)

0.099

R19.1c

Yes

Partly Yes/No

4.07(3.06,5.57)

4.15(2.45,6.29)

0.751

27(9,41)

15.5(6,27.75)

0.09

R19.1d

Yes

Partly Yes/No

112

3.68(2.42,5.6)

3.39(2.33,5.42)

0.476

118

15(8,29.25)

9(3.5,18)

0.001

R19.1e

Yes

Partly Yes/No

146

2.53(1.7,3.77)

3.98(2.55,5.66)

0.001

153

10.5(7.75,20.75)

12(5,25.5)

0.855

R22.1a

Yes

Partly Yes/No

132

3.91(2.38,6.45)

3.47(2.41,5.42)

0.582

143

14.5(7,31.25)

11(6,25)

0.469

R22.1b

Yes

Partly Yes/No

140

4.64(2.78,6.49)

3.26(2.33,5.19)

0.022

149

8(2,18)

14(7,26)

0.015

R22.1c

Yes

Partly Yes/No

4.4(3.03,6.21)

2.65(1.85,4.43)

＜0.001

100

12(6.25,25.75)

12(6,25)

0.681

Total

≥ 50%

＜ 50%

111

4.2(2.97,6.02)

3.17(2.14,5.18)

0.002

115

16.5(8.25,36.5)

10(5,19)

0.004

IFs, impact factors; IQR, interquartile range

Journal IFs of 11 articles were unavailable and therefore not included in the analysis. Citations were obtained from Google Scholar on November 18.

To our knowledge, this is the first time that the complete RECORD checklist was used to evaluate the reporting quality of studies using RWD, with the analysis of the change in reporting quality and its relationship to the journal IFs and citations. We found that only 72 (38.5%) articles adequately reported more than 50% of the items, and some vital items were very insufficiently reported. The overall reporting quality was poor. Some items were reported similarly to analogous studies, such as the codes and algorithms for population selection and their validation studies, as well as data cleaning (R6.1, R6.2, R12.2), while some items were reported worse than analogous studies, such as the codes and algorithms of exposures, outcomes, confounders, and the availability of supplementary materials (R7.1a, R7.1b, R22.1c) [19,25]. While most studies stated that there were inherent limitations when using RWD, there is insufficient discussion of the limitations in some specific areas. Data cleaning, data linkage, and disclosure of relevant information were also severely underreported.

Previous studies have investigated changes in reporting quality before and after the release of other checklists, such as STROBE [26], CONSORT [27], etc. We used similar approaches to compare changes in reporting quality before and after the release of RECORD, and we found statistically significant increases in reporting quality for only two items which suggests years after the release of RECORD, the reporting quality of cohort studies has not improved in total.

The journal IFs and citations were related to some areas and significantly higher in high-reporting quality articles. Our results were compatible with the study conducted by Pol CB van der et al that high-quality studies were cited more frequently [28]. Interestingly, we found that few high-quality articles had low journal IFs and few citations instead, and vice versa. However, our analysis was cursory because the effect of time was not removed, and the newly published articles may have fewer citations.

RWE, on one hand, holds the potential to address important questions [29], on the other hand, there is some controversy due to the issues such as the quality of data sources or studies. The reporting quality issue of concern to this research, to some extent, influences the quality of RWE and its utility in decision-making. Databases may have issues with incomplete, inconsistent, and inaccurate coding, which makes it more challenging and complex to create data linkages and seriously impedes the reproducibility and replicability of studies, and therefore distinct reporting of codes or algorithms and their validation allows critical assessment by readers and benefits the generalization of study findings [30-32]. Linkage of databases can supplement and enrich data sources and transparent reporting can increase confidence in RWE [33,34]. Only 28 (42.4%) articles in the present research reported details such as the level or methods of linkage, yet this may still be insufficient, and some extended guidelines have been developed for more detailed issues of data linkage [34,35]. The disclosure of research-related information is also fundamental to improving the transparency of studies, especially the availability of raw data, which enables readers to assess the authenticity and reliability of the findings.

We believe that establishing a high-quality analytical database with accuracy, completeness, consistency, and wide applicability is the core of acquiring reliable RWE [36], and the crucial elements involve codes or algorithms and their validations, quality of data linkage, data cleaning, and the establishment of data specifications. This necessitates that we concentrate on both the quality of the research process, including methodology and reporting quality, as well as the quality of the data sources, such as standardized data structures and rigorous data quality assessments [37,38]. In the meantime, health data not collected for specific purposes are generally not standardized, in contrast to the strictly conducted RCTs. It is almost impossible to balance all confounders and eliminate the impact of quality problems such as data errors and missing. However, we must still be comprehensively aware of the specific limitations of studies using such data and do everything possible to lessen their impact.

Our research demonstrated the current reporting state of cohort studies using RWD in recent years. There are no restrictions on population and exposure measures, and results have wider applicability. According to our research, we can recognize that there are various issues with this type of study which may be caused by the inadequate dissemination and endorsement of pertinent guidelines, the incomplete methodology consensus, etc. However, we realized that full compliance with the RECORD guidelines is almost impossible in some circumstances because of the technical issues involved in data processing, local policy implications, etc. But for real-world studies, RWD should be made realistic, standardized, and easy to handle from the inception of the study dataset development, otherwise such studies would create more risks of bias different from traditional research methods, which requires the involvement of policy makers, technical personnel, investigators, clinicians, epidemiologists, and methodologists. And at least, researchers can standardize data and research processes as much as possible, we anticipate that our research can promote the spreading of RECORD and suggest the possible direction for researchers to improve RWE quality.

There are some limitations to this research. First, we only used the RECORD checklist to evaluate included articles, and other important aspects of observational studies mentioned in the STROBE checklist were not evaluated, such as details of study design, statistical methods, and reporting of results. Second, some items may not have undergone a strict enough evaluation, such as item R12.3 was deemed sufficient if the author described the level, techniques, and methods of data link or the method to evaluate its quality, which did not demand to be fully detailed. Third, our search strategy cannot retrieve studies that did not mention the “real world” in the paper, consequently, we may overlook many articles that met the criteria, but the final inclusion result was within the acceptable range. Finally, subjectivity may be an inevitable part of the research process. We did not analyze the differences between the results of different reviewers, but we tried to eliminate discrepancies by ensuring the independence of the review and conducting panel discussions.

To conclude, the reporting of cohort studies using RWD was generally inadequate and has not improved in recent years. Journal IFs and article citations were significantly related to the reporting of some areas. We encourage researchers to endorse relevant guidelines when utilizing RWD for research to maximize the value of RWD, obtain high-quality RWE, and prevent misleading clinical decisions.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and materials

The full data set is available on request from the corresponding author.

Competing interests

The authors declare that they have no competing interests.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 82274685); the Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences (Grant No. CI2021A01009; the Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences (Grant No. CI2021A05312).

Authors' contributions

R Zhao and W Zhang: project development, search strategy development, data screening and extraction, evaluation, data analysis, manuscript writing. ZD Zhang and C He: data screening and extraction, evaluation, data rectification. R Xu: search strategy development, data analysis, manuscript writing. B Wang and XD Tang: project development, data rectification, manuscript writing. All authors revised the manuscript and approved the final version.

Acknowledgements

Not applicable.

Khozin S, Blumenthal GM, Pazdur R. Real-world Data for Clinical Evidence Generation in Oncology. J Natl Cancer Inst 2017;109. https://doi.org/10.1093/jnci/djx187.
Gliklich RE, Leavy MB. Assessing Real-World Data Quality: The Application of Patient Registry Quality Criteria to Real-World Data and Real-World Evidence. Ther Innov Regul Sci 2020;54:303–7. https://doi.org/10.1007/s43441-019-00058-6.
US Food and Drug Administration. Real-World Evidence. FDA 2022. https://www.fda.gov/science-research/science-and-research-special-topics/real-world-evidence (accessed November 1, 2022).
Eichler H-G, Pignatti F, Schwarzer-Daum B, Hidalgo-Simon A, Eichler I, Arlett P, et al. Randomized Controlled Trials Versus Real World Evidence: Neither Magic Nor Myth. Clin Pharmacol Ther 2021;109:1212–8. https://doi.org/10.1002/cpt.2083.
Breckenridge AM, Breckenridge RA, Peck CC. Report on the current status of the use of real-world data (RWD) and real-world evidence (RWE) in drug development and regulation. Br J Clin Pharmacol 2019;85:1874–7. https://doi.org/10.1111/bcp.14026.
Thompson D. Replication of Randomized, Controlled Trials Using Real-World Data: What Could Go Wrong? Value Health J Int Soc Pharmacoeconomics Outcomes Res 2021;24:112–5. https://doi.org/10.1016/j.jval.2020.09.015.
Raphael MJ, Gyawali B, Booth CM. Real-world evidence and regulatory drug approval. Nat Rev Clin Oncol 2020;17:271–2. https://doi.org/10.1038/s41571-020-0345-7.
Wang SV, Sreedhara SK, Schneeweiss S, REPEAT Initiative. Reproducibility of real-world evidence studies using clinical practice data to inform regulatory and coverage decisions. Nat Commun 2022;13:5126. https://doi.org/10.1038/s41467-022-32310-3.
Wang SV, Schneeweiss S, Berger ML, Brown J, de Vries F, Douglas I, et al. Reporting to Improve Reproducibility and Facilitate Validity Assessment for Healthcare Database Studies V1.0. Value Health J Int Soc Pharmacoeconomics Outcomes Res 2017;20:1009–22. https://doi.org/10.1016/j.jval.2017.08.3018.
Benchimol EI, Manuel DG, To T, Griffiths AM, Rabeneck L, Guttmann A. Development and use of reporting guidelines for assessing the quality of validation studies of health administrative data. J Clin Epidemiol 2011;64:821–9. https://doi.org/10.1016/j.jclinepi.2010.10.006.
Malone DC, Brown M, Hurwitz JT, Peters L, Graff JS. Real-World Evidence: Useful in the Real World of US Payer Decision Making? How? When? And What Studies? Value Health J Int Soc Pharmacoeconomics Outcomes Res 2018;21:326–33. https://doi.org/10.1016/j.jval.2017.08.3013.
Benchimol EI, Smeeth L, Guttmann A, Harron K, Moher D, Petersen I, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med 2015;12:e1001885. https://doi.org/10.1371/journal.pmed.1001885.
Langan SM, Schmidt SA, Wing K, Ehrenstein V, Nicholls SG, Filion KB, et al. The reporting of studies conducted using observational routinely collected health data statement for pharmacoepidemiology (RECORD-PE). BMJ 2018;363:k3532. https://doi.org/10.1136/bmj.k3532.
Public Policy Committee, International Society of Pharmacoepidemiology. Guidelines for good pharmacoepidemiology practice (GPP). Pharmacoepidemiol Drug Saf 2016;25:2–10. https://doi.org/10.1002/pds.3891.
The European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP). ENCePP Home Page n.d. https://www.encepp.eu/standards_and_guidances/ (accessed November 1, 2022).
von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies. Int J Surg Lond Engl 2014;12:1495–9. https://doi.org/10.1016/j.ijsu.2014.07.013.
Pouwels KB, Widyakusuma NN, Groenwold RHH, Hak E. Quality of reporting of confounding remained suboptimal after the STROBE guideline. J Clin Epidemiol 2016;69:217–24. https://doi.org/10.1016/j.jclinepi.2015.08.009.
Antwi E, Amoakoh-Coleman M, Vieira DL, Madhavaram S, Koram KA, Grobbee DE, et al. Systematic review of prediction models for gestational hypertension and preeclampsia. PloS One 2020;15:e0230955. https://doi.org/10.1371/journal.pone.0230955.
Hemkens LG, Benchimol EI, Langan SM, Briel M, Kasenda B, Januel J-M, et al. The reporting of studies using routinely collected health data was often insufficient. J Clin Epidemiol 2016;79:104–11. https://doi.org/10.1016/j.jclinepi.2016.06.005.
Wang X, Kattan MW. Cohort Studies: Design, Analysis, and Reporting. Chest 2020;158:S72–8. https://doi.org/10.1016/j.chest.2020.03.014.
Euser AM, Zoccali C, Jager KJ, Dekker FW. Cohort studies: prospective versus retrospective. Nephron Clin Pract 2009;113:c214-217. https://doi.org/10.1159/000235241.
Na D, Sr T, M B, D O, P M, R G. Why observational studies should be among the tools used in comparative effectiveness research. Health Aff Proj Hope 2010;29. https://doi.org/10.1377/hlthaff.2010.0666.
Tepe G, Zeller T, Moscovic M, Corpataux J-M, Christensen JK, Keirse K, et al. Paclitaxel-Coated Balloon for the Treatment of Infrainguinal Disease: 12-Month Outcomes in the All-Comers Cohort of BIOLUX P-III Global Registry. J Endovasc Ther Off J Int Soc Endovasc Spec 2020;27:304–15. https://doi.org/10.1177/1526602819898804.
Scottish Intercollegiate Guidelines Network. Search filters n.d. https://www.sign.ac.uk/what-we-do/methodology/search-filters/ (accessed November 1, 2022).
Yolcu Y, Wahood W, Alvi MA, Kerezoudis P, Habermann EB, Bydon M. Reporting Methodology of Neurosurgical Studies Utilizing the American College of Surgeons-National Surgical Quality Improvement Program Database: A Systematic Review and Critical Appraisal. Neurosurgery 2020;86:46–60. https://doi.org/10.1093/neuros/nyz180.
Rao A, Brück K, Methven S, Evans R, Stel VS, Jager KJ, et al. Quality of Reporting and Study Design of CKD Cohort Studies Assessing Mortality in the Elderly Before and After STROBE: A Systematic Review. PloS One 2016;11:e0155078. https://doi.org/10.1371/journal.pone.0155078.
Plint AC, Moher D, Morrison A, Schulz K, Altman DG, Hill C, et al. Does the CONSORT checklist improve the quality of reports of randomised controlled trials? A systematic review. Med J Aust 2006;185:263–7. https://doi.org/10.5694/j.1326-5377.2006.tb00557.x.
Pol CB van der, McInnes MDF, Petrcich W, Tunis AS, Hanna R. Is Quality and Completeness of Reporting of Systematic Reviews and Meta-Analyses Published in High Impact Radiology Journals Associated with Citation Rates? PLOS ONE 2015;10:e0119892. https://doi.org/10.1371/journal.pone.0119892.
Penberthy LT, Rivera DR, Lund JL, Bruno MA, Meyer A-M. An overview of real-world data sources for oncology and considerations for research. CA Cancer J Clin 2022;72:287–300. https://doi.org/10.3322/caac.21714.
Carter B, Verity Bennett C, Bethel J, Jones HM, Wang T, Kemp A. Identifying cerebral palsy from routinely-collected data in England and Wales. Clin Epidemiol 2019;11:457–68. https://doi.org/10.2147/CLEP.S200748.
Lyu H, Haider A, Landman A, Raut C. The Opportunities and Shortcomings of Using Big Data and National Databases for Sarcoma Research. Cancer 2019;125:2926–34. https://doi.org/10.1002/cncr.32118.
Twiss E, Krijnen P, Schipper I. Accuracy and reliability of injury coding in the national Dutch Trauma Registry. Int J Qual Health Care J Int Soc Qual Health Care 2021;33:mzab041. https://doi.org/10.1093/intqhc/mzab041.
Rivera DR, Gokhale MN, Reynolds MW, Andrews EB, Chun D, Haynes K, et al. Linking electronic health data in pharmacoepidemiology: Appropriateness and feasibility. Pharmacoepidemiol Drug Saf 2020;29:18–29. https://doi.org/10.1002/pds.4918.
Pratt NL, Mack CD, Meyer AM, Davis KJ, Hammill BG, Hampp C, et al. Data linkage in pharmacoepidemiology: A call for rigorous evaluation and reporting. Pharmacoepidemiol Drug Saf 2020;29:9–17. https://doi.org/10.1002/pds.4924.
Gilbert R, Lafferty R, Hagger-Johnson G, Harron K, Zhang L-C, Smith P, et al. GUILD: GUidance for Information about Linking Data sets. J Public Health Oxf Engl 2018;40:191–8. https://doi.org/10.1093/pubmed/fdx037.
Ehsani-Moghaddam B, Martin K, Queenan JA. Data quality in healthcare: A report of practical experience with the Canadian Primary Care Sentinel Surveillance Network data. Health Inf Manag J 2021;50:88–92. https://doi.org/10.1177/1833358319887743.
Reps JM, Schuemie MJ, Suchard MA, Ryan PB, Rijnbeek PR. Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data. J Am Med Inform Assoc JAMIA 2018;25:969–75. https://doi.org/10.1093/jamia/ocy032.
Blacketer C, Defalco FJ, Ryan PB, Rijnbeek PR. Increasing trust in real-world evidence through evaluation of observational data quality. J Am Med Inform Assoc JAMIA 2021;28:2251–7. https://doi.org/10.1093/jamia/ocab132.

No competing interests reported.

Download PDF

Journal Publication

published 29 Jun, 2023

Read the published version in BMC Medical Research Methodology →

Editorial decision: Major revision
11 May, 2023
Reviews received at journal
08 May, 2023
Reviewers agreed at journal
03 May, 2023
Reviews received at journal
02 Apr, 2023
Reviewers agreed at journal
23 Mar, 2023
Reviewers invited by journal
02 Mar, 2023
Editor assigned by journal
30 Jan, 2023
Editor invited by journal
25 Jan, 2023
Submission checks completed at journal
25 Jan, 2023
First submitted to journal
23 Dec, 2022

You are reading this latest preprint version

Evaluation of reporting quality of cohort studies using real-world data based on RECORD

Status:

Journal Publication

Version 1

Abstract

Figures

Introduction

Methods

2.1. Eligibility of studies

2.2. Search strategy

2.3. Screening and data extraction

2.4. Evaluation

2.5. Data analysis

Results

3.1. Screening results and characteristics of included studies

3.2 Reporting quality of cohort studies using RWD

3.3 Reporting before and after the release of RECORD

3.4 Correlations of journal IFs and citations with reporting quality

Discussion

Declarations

Ethics approval and consent to participate

Consent for publication

Availability of data and materials

Competing interests

Funding

Authors' contributions

Acknowledgements

References

Additional Declarations

Supplementary Files

Status:

Journal Publication

Version 1