Checklist and guidance on creating codelists for routinely collected health data research

Background: Codelists are required to extract meaningful information on characteristics and events from routinely collected health data such as electronic health records. Research using routinely collected health data relies on codelists to define study populations and variables; trustworthy codelists are therefore important. Here, we provide a checklist, in the style of commonly used reporting guidelines, to help researchers adhere to best practice in codelist development and sharing.
Methods: Based on a literature search and a workshop with researchers experienced in the use of routinely collected health data, we created a set of recommendations that (1) are broadly applicable to different datasets, research questions, and methods of codelist creation; (2) are easy for an individual researcher to follow, implement and document; and (3) fit within a step-by-step process. We then formatted these recommendations into a checklist.
Results: We have created a 10-step checklist, comprising 28 items, with accompanying guidance on each step. The checklist advises on which metadata to provide, how to define a clinical concept, how to identify and evaluate existing codelists, how to create new codelists, and how to review, check, finalise, and publish a created codelist.
Conclusions: Use of the checklist can reassure researchers that best practice was followed during the development of their codelists, increasing trust in research that relies on these codelists and facilitating wider re-use and adaptation by other researchers.


Background
Routinely collected health data are commonly used for epidemiological research, bringing opportunities to address questions not easily answered with clinical trials or research-specific data collection [1]. Routinely collected health data are commonly structured and coded based on dictionary ontologies or clinical vocabularies. These vary widely in scope and specificity of coding; for example, the International Classification of Diseases (ICD) [2] has traditionally been used for administrative purposes such as recording of deaths and hospital activity, whereas Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) [3] was developed for use in clinical practice and includes a more extensive range of codes.
To extract meaningful information on health-related characteristics and events (e.g., diagnoses, prescriptions, referrals, test results, lifestyle factors, etc.) from routinely collected health data, researchers create codelists (also referred to as clinical codelists, code sets, or value sets) [4]. This is done by identifying relevant codes from the dictionary vocabulary (e.g., all the diagnosis, treatment, referral, etc. codes in SNOMED CT indicating that a person has diabetes). In studies using routinely collected health data, codelists define the study population and the other variables that researchers use to answer the research question. Therefore, good practice in codelist development is an essential step in ensuring that codelists accurately capture the health-related characteristics or events of interest.
Checklists are increasingly being used in health research to promote adherence to recommended good practice [5], including research using routinely collected health data, where the REporting of studies Conducted using Observational Routinely-collected Data (RECORD) statement requires "a complete list of codes and algorithms used to classify exposures, outcomes, confounders, and effect modifiers" [6]. While a number of articles already provide guidance on creating, sharing and managing codelists, these focus on specific scenarios (e.g., specific coding systems, or specific codelist creation tools or methods), or pertain to higher-level recommendations (e.g., for organisations, funders, or journals, rather than individual researchers) [4,7-11]. Thus, we created an easy-to-use checklist and step-by-step guidance that researchers using routinely collected health data can follow to ensure good practice.

Patient and Public Involvement
The target audience for this methods paper is researchers who use, or are planning to use, electronic health records for research. Researchers at all stages of their academic careers were involved throughout the project, including in developing objectives. We will involve researchers from a wider group of institutions by encouraging them to participate in the open review process. Patients or the public were not involved in this project.

Checklist development
We formed a codelist task group including the following authors of this paper: JM, KA, AS, L-YL, and HS. All task group members were PhD students or academic staff members at the London School of Hygiene and Tropical Medicine (LSHTM). The task group completed an initial literature search in PubMed to identify published papers describing methods and guidance for codelists. The most comprehensive review of the methodological literature on codelists was by Williams et al. in 2017; this provides a set of best practice recommendations for future studies and software tools but does not aim to provide guidance for individual researchers on how to implement these recommendations [4]. We updated this review, using the published search strategy, to find new literature released since 2017 (for a description of this literature search process, see Box 1: Updated literature search). We also reviewed recommendations in other pertinent publications identified during this process [8-11] and features of different codelist sharing websites and general purpose research repositories [12-15].

Box 1. Updated literature search
We performed a literature search based on, and using the same search strategy as, the existing review by Williams R, et al., 2017 [4] to find new literature released since 2017 on the topic. We did not intend to re-evaluate the recommendations proposed by Williams et al., but rather to identify important new literature on codelists that could inform the creation of our checklist and guidance. We screened the titles and abstracts of 427 papers published between June 2017 and December 2022 and indexed in PubMed, of which we screened 24 in full text. From these we excluded papers specifically discussing the transition in the US from ICD-9 to ICD-10, papers with a higher-level focus on terminologies (such as mappings between them) but no focus on codelists, and applied papers, including papers that use codelists but do not discuss the construction, reuse, validation, or sharing of codelists (as was done in Williams R, et al., 2017). There remained 9 papers from which we considered recommendations on codelist management. From these papers, we found 2 areas where additional recommendations were considered for inclusion in our checklist and guidance. The two identified topics are as follows:
1. When SNOMED CT is the available terminology, it may be preferable to avoid "flat" codelists (i.e., a list of all codes to define a concept) in favour of using SNOMED CT concept hierarchies (i.e., a primary concept and its descendants, optionally with additional relationships). These concept hierarchies may define more complex concepts (e.g., "(Cerebrovascular accident OR History of Cerebrovascular accident) AND NOT Ruptured aneurysm") [16-18]. For drugs, it may be possible to use other terminologies such as MeSH, ATC, etc. to create similar concept hierarchies rather than creating "flat" codelists [19]. While a recommendation to make use of concept hierarchies was already included in the Williams et al. 2017 review, which was adapted for our checklist and guidance, we decided not to include guidance specific to the SNOMED CT terminology, as this did not adhere to our criterion of being broadly applicable to different datasets, research questions, and methods of codelist creation.
2. If available, measures to check the quality of code sets should be used. The use of inter-terminology maps is recommended to check codelist completeness when codelists exist in multiple terminologies (e.g., when creating a codelist in SNOMED CT, map an existing ICD-10 codelist to SNOMED CT and check for overlap and differences) [20]. However, caution is needed when mapping terms from different ontologies to each other, as they may have been created for different purposes (e.g., documentation, billing, registries, referrals or information sharing) and are often used in different care settings (e.g., SNOMED CT in primary care in the UK and ICD-10 codes in secondary care). Some authors propose data-centric natural language processing methods to semi-automatically check codelists; however, this depends on the availability of such systems [21]. Within the excluded papers, we found multiple recommendations for the use of common data models, which may address problems with codelists at a higher level and which we did not focus on in this work. We mention the use of inter-terminology maps in the guidance section on searching for existing codelists.

Amendments from Version 1
Compared to the previously published version, the checklist contains one additional step (Checks), which prompts users to check their codelists against the database they were created for (internally) or other data sources (externally). Other changes include changing the title and clarifying that the checklist and guidance are relevant for research in routinely collected health data (not just in electronic health records), and additional discussion on codelist repositories and codelists used in clinical practice.
Any further responses from the reviewers can be found at the end of the article.
Based on these publications and our expertise in using routinely collected health data, the task group drafted an initial checklist, encompassing a set of recommendations on codelist development and sharing that needed to meet the following criteria: (1) broadly applicable to different datasets, research questions, and methods of codelist creation; (2) easy for an individual researcher to follow, implement and document; and (3) suited to a step-by-step process where some items should be completed before others. This draft checklist was presented, and pilot tested on example codelists, in a workshop with a wider group of researchers in the Electronic Health Records research group at the London School of Hygiene and Tropical Medicine (EHR research group). From this we gathered feedback, which was used to further refine the recommendations (for a description of this process, see Box 2: Feedback from workshop). Finally, we circulated the checklist to be reviewed and approved by the EHR research group at LSHTM and other stakeholders.

Box 2. Feedback from workshop
The task group convened a small group workshop to understand current codelist reporting practices and improve the process of creation, management, storage and sharing of codelists. All academic staff and PhD student members of the LSHTM Electronic Health Records research group were invited to attend. The workshop was held at the workplace for approximately 3 hours and was facilitated by the task group. Each of 4 groups of 3 to 4 people was provided with an example codelist (that had been employed in previous research), a draft version of the codelist guidance document based on a review of existing literature, and a questionnaire. Each group used the questionnaire to assess the codelist against the provided draft guidelines. Attendees were then asked to provide input to the draft guidelines in a plenary session. The plenary session was structured around two main discussion topics: existing codelists and new codelists. The discussion centred on key themes contained within these discussion topics. The task group took notes during discussions and collated notes from the filled-in questionnaires. Key themes for existing codelists included identifying published codelists and updating existing codelists. Key themes for creating new codelists included defining the clinical concept, creating the codelist, finalising the codelist and sharing the codelist. Several key takeaways emerged from these discussions:
1. Existing codelists: Participants stressed the need to create precise instructions for using previous codelists and updating them effectively. This would involve documenting instances of "absence of" evidence, for example, where no relevant codelists were found.
2. New codelists: Defining the clinical concept: Participants identified a need for clear processes around defining the clinical concept. They advocated for clearly documenting and versioning iterative searches for synonyms and for consulting experts early when defining the clinical concept. The participants stressed that these components should be part of the core documentation provided with the codelist and metadata.
3. Creating codelists: A suggestion was made to provide a cover sheet template to facilitate the implementation of information from the guidance.
4. Sharing codelists: Recognition of authorship: Participants emphasized the need to establish guidelines for recognizing and crediting individuals involved in codelist creation.
5. Improve knowledge about codelists and coding systems: The group advocated for an overview of codelists and coding systems to provide context and clarity in their usage.
In summary, the small group workshop discussions yielded valuable insights for enhancing codelist creation and documentation practices, ultimately aiming to improve the clarity and effectiveness of these processes for better healthcare data management and research.

Ethical consideration
Ethical approval was not required for this study, as the current LSHTM policy is that only research activities involving human participants, their data, or their biological material must be submitted to and reviewed by the relevant LSHTM research ethics committee [22]. The workshop is considered a professional involvement activity, and not participation in a study; therefore, no informed consent was required. We also confirmed this with the LSHTM ethics team, who responded: "The current LSHTM policy is that only research activities involving human participants, their data, or their biological material must be submitted to and reviewed by the relevant LSHTM research ethics committee. Approval must be in place before the research starts. We do not expect to review literature reviews as there are no human participants, individual level human data, or biological material. We also do not expect to review public/professional 'involvement' activities. Involvement in research means research that is done 'with' or 'by' the people involved, not 'to', 'for' or 'about' them. It just allows people with relevant experience contribute to how research is designed, conducted and disseminated."

Results
Below we provide a 10-step checklist (Table 1), comprising 28 items, with accompanying guidance on each step. We provide a filled-in example of the checklist in Table 2.

Guidance
Step 1: Define
To find or create a suitable codelist, it is necessary to clearly state the following. Firstly, (1a - Concept) state what the codelist intends to capture (e.g., a disease, drug, test results, etc.). Secondly, (1b - Timeframe) state if current (prevalent), new (incident) or previous events are of interest (e.g., a codelist for incident asthma may aim to capture only codes indicating a first occurrence of asthma, not including asthma-related administrative or treatment codes, which are likely to indicate ongoing asthma). Thirdly, (1c - Accuracy) state if the codelist should prioritise sensitivity (i.e., include codes "probably" indicating the clinical phenotype, e.g., "suspected asthma", "referred to asthma clinic") or specificity (i.e., include only codes that "definitely" match the concept). Finally, (1d - Setting) state where the codes occur (e.g., the health care setting, such as primary care or hospital care) and what types of codes are included (e.g., diagnostic codes, referrals, administrative codes, disease history codes). Together, this information makes up a clinical concept (e.g., "codes definitely describing current or previous asthma in primary care, including diagnostic, treatment, administrative and disease history codes").
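The four items of Step 1 can also be recorded in a structured form that travels with the codelist. The following is a minimal sketch in Python, offered only as an illustration and not as part of the checklist itself; the class name `ClinicalConcept`, its field names, and the permitted values for `accuracy` are all assumptions made for this example.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ClinicalConcept:
    """Hypothetical record of the Step 1 items (1a to 1d)."""
    concept: str        # 1a: what the codelist intends to capture
    timeframe: str      # 1b: e.g. "incident" or "current or previous"
    accuracy: str       # 1c: "sensitive" or "specific" (assumed vocabulary)
    setting: str        # 1d: health care setting in which the codes occur
    code_types: tuple   # 1d: types of codes included

    def __post_init__(self):
        # Reject values outside the two options described in item 1c.
        if self.accuracy not in ("sensitive", "specific"):
            raise ValueError("accuracy must be 'sensitive' or 'specific'")

    def describe(self) -> str:
        """Build a one-sentence description of the clinical concept."""
        wording = "definitely" if self.accuracy == "specific" else "probably"
        return (
            f"codes {wording} describing {self.timeframe} {self.concept} "
            f"in {self.setting}, including {', '.join(self.code_types)} codes"
        )


asthma = ClinicalConcept(
    concept="asthma",
    timeframe="current or previous",
    accuracy="specific",
    setting="primary care",
    code_types=("diagnostic", "treatment", "administrative", "disease history"),
)
print(asthma.describe())
```

Calling `asdict(asthma)` yields a plain dictionary that could be stored with the codelist metadata.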
Step 2: Search
(2a - Sources searched) Existing codelists that match your requirements can be identified via an internet search (e.g., using a search engine to search for "asthma codelist CPRD"), a search of publication databases or codelist repositories (e.g., the HDR UK phenotype library), or through existing collaborations and networks. Document which sources were searched. (2b - Existing codelists found) This search does not need to be systematic, but rather should identify codelists that may be directly reused or that can help in creating a new codelist. To choose potentially suitable codelists, check the codelist metadata, including which clinical concept the codelist aims to capture, when the codelist was created, which database it was used in, which terminology and which version of the terminology was used (as different versions of the same data source and terminology can contain different codes), and whether there are any copyright restrictions. Codelists in other terminologies may also be useful, especially if these can be reliably mapped to the terminology of interest; however, this is not always possible. Document which suitable codelists you found.
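Where a map between terminologies is available, an existing codelist in another terminology can be translated and compared with a draft. The sketch below illustrates the idea under stated assumptions: the function name `map_codelist` is hypothetical, the map is assumed to be a simple one-to-many lookup, and all codes are invented placeholders rather than real ICD-10 or SNOMED CT codes.

```python
def map_codelist(source_codes, code_map):
    """Translate source codes through a one-to-many map.

    Returns (mapped, unmapped): the set of target codes reached through
    the map, and the set of source codes that have no entry in it.
    """
    mapped, unmapped = set(), set()
    for code in source_codes:
        targets = code_map.get(code, [])
        if targets:
            mapped.update(targets)
        else:
            unmapped.add(code)
    return mapped, unmapped


# Placeholder codes only; a real map would come from the terminology provider.
existing_icd_codelist = {"ICD-A", "ICD-B", "ICD-C"}
icd_to_snomed = {"ICD-A": ["SCT-1", "SCT-2"], "ICD-B": ["SCT-3"]}
draft_snomed_codelist = {"SCT-1", "SCT-3", "SCT-9"}

mapped, unmapped = map_codelist(existing_icd_codelist, icd_to_snomed)
missing_from_draft = mapped - draft_snomed_codelist   # candidates to add
only_in_draft = draft_snomed_codelist - mapped        # no counterpart in the map
print(sorted(missing_from_draft), sorted(only_in_draft), sorted(unmapped))
```

Codes that cannot be mapped are returned separately so that they can be reviewed by hand rather than silently dropped, which reflects the caution that reliable mapping is not always possible.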
Step 3: Verify
In addition to matching your requirements (in terms of concept, terminology, etc.), the quality of existing codelists needs to be verified. (3a - Verified by others) Identify which information is available, besides the metadata, to allow you to judge if the codelist was created using good practice. Projects or published studies dedicated to, or including, codelist validation may be of particular interest [23]. (3b - Verified by yourself) If the available information isn't sufficient to judge the quality of an existing codelist, various checks can be conducted depending on the specific use-case. The codelist may be cross-checked with other existing codelists to verify if different authors consistently include the same codes. A review of the existing codelist may be performed, similar to that done for a newly created codelist (see Step 7). If you have access to your study data or the number of observations for each code, you may also check the number of records the codelist retrieves, which may be compared to expectations based on clinical knowledge or previous studies.
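The cross-check between codelists described in item 3b can be sketched as follows. This is an illustration only: the function name `cross_check`, the codelist labels, and the codes are hypothetical, and the codelists are assumed to be in the same terminology.

```python
from collections import Counter


def cross_check(codelists):
    """Split codes into those in every codelist and those in only some.

    codelists: dict mapping a codelist label to an iterable of codes.
    Returns (consistent, contested), where contested maps each code to the
    sorted labels of the codelists that include it.
    """
    as_sets = {label: set(codes) for label, codes in codelists.items()}
    counts = Counter(code for codes in as_sets.values() for code in codes)
    consistent = {code for code, n in counts.items() if n == len(as_sets)}
    contested = {
        code: sorted(label for label, codes in as_sets.items() if code in codes)
        for code, n in counts.items()
        if n < len(as_sets)
    }
    return consistent, contested


# Placeholder codes and labels for illustration.
consistent, contested = cross_check({
    "author_A": ["X001", "X002", "X003"],
    "author_B": ["X001", "X003", "X004"],
    "candidate": ["X001", "X003"],
})
print(sorted(consistent))
print(contested)
```

Codes that appear in only some codelists are not necessarily errors; they are simply the ones most worth examining against the clinical concept.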
Step 4: Reference
(4a - Existing codelists used) Any existing codelists that are used should be referenced, giving credit to the author(s), and making it easy for others to evaluate your study, or find and adapt the codelist for their own purposes. You should state whether you have identified a codelist that suits your purposes without modification, whether it required changes to be suitable for your study, or whether it was used to check or inform the creation of a new codelist. You should also state what the existing codelist was originally used for. We suggest wording such as "codelist(s) for [clinical concept] are from/were adapted from/were cross checked with …". References to existing codelists should include the author(s), year, and permanent identifier (such as a DOI, URL or manuscript reference). You may include these references directly as part of this checklist, in your study or codelist repository (see Step 9), or in the section of your manuscript or manuscript appendix that describes study variables.
Step 5: Prepare
(5a - Synonyms) Identify synonyms and related words for the clinical concept (e.g., "asthma" for an asthma codelist; "stomach/gastric", "cancer/neoplasm/malignant tumour", etc., for a stomach cancer codelist; "beta-blocker", "beta-adrenoceptor-antagonist", and substance and trade names for a beta-blocker codelist). Consulting and referencing sources of clinical information can be useful. For example, Medical Subject Headings on PubMed [24], clinical knowledge summaries and guidelines (such as those provided by the National Institute for Health and Care Excellence (NICE) in the UK [25]), and websites of patient organisations may all contain useful information. (5b - Exceptions) At this stage, identifying exceptions to the concept that shouldn't be included in the codelist is also important (e.g., if only "allergic" forms of asthma should be included, identify the words "non-allergic", "exercise-induced", etc.).
Step 6: Create
In this step, you create and iteratively refine a draft codelist. (6a - Method used) This can be done in a variety of ways. Guidance on the use of specific methods for creating codelists is available elsewhere, including on using Stata scripts [8], online tools [7], and for specific use-cases, such as drug codelists [10]. (6b - Search terms) Most approaches will involve searching a dictionary (also referred to as a browser), firstly using search terms that correspond to the clinical concept or synonyms thereof, and secondly using exclusion terms to exclude codes that should not be in the codelist. For example, you create a script that searches for a list of predefined search terms (e.g., "asthma", "inhaler", etc.) and then excludes codes based on predefined exclusion terms (e.g., "referral", "review", etc.). Once finalised, report this list of search terms and, if applicable, exclusion terms. (6c - Hierarchy used to extend search) Make use of dictionary hierarchies, e.g., by checking codes that are in the same or a descendant chapter as already included codes, to identify further codes that are related but may have different names or labels (e.g., check which other names for a disease or brand names for drugs may be included in the same Read code or ICD chapter or SNOMED CT concept). (6d - Decisions made while iterating) When developing the draft codelist, the search should be iteratively refined by repeatedly checking the retrieved and excluded codes and adding terms to the list of search terms and exclusion terms. It may be better to also include codes where you are unsure if they should be in the codelist, as it is easier to exclude codes in the review stage than it is to add codes. Record important decisions made while refining the search, e.g., document the reasons for inclusions or exclusions. If necessary, revisit the definition of the clinical concept, and record additional decisions in descriptions or comments. (6e - Categories) You may want to specify categories within the codelist, e.g., incident and prevalent codes, more sensitive or more specific codes, or only diagnosis codes versus diagnosis and administrative codes (e.g., allowing for the conduct of secondary or sensitivity analyses).
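The search described in items 6b and 6c can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular tool: the dictionary is assumed to be a list of (code, term) pairs, the hierarchy a mapping from a parent code to its child codes, and every code, term, and function name below is a hypothetical placeholder.

```python
import re


def search_dictionary(dictionary, search_terms, exclusion_terms=()):
    """Return (included, excluded) rows of (code, term).

    A row is retrieved if its term contains any search term (case
    insensitive) and is then excluded if it contains any exclusion term.
    Excluded rows are returned as well so that they can be checked.
    """
    if not search_terms:
        raise ValueError("at least one search term is required")
    include = re.compile("|".join(map(re.escape, search_terms)), re.IGNORECASE)
    exclude = (
        re.compile("|".join(map(re.escape, exclusion_terms)), re.IGNORECASE)
        if exclusion_terms else None
    )
    included, excluded = [], []
    for code, term in dictionary:
        if not include.search(term):
            continue
        if exclude and exclude.search(term):
            excluded.append((code, term))
        else:
            included.append((code, term))
    return included, excluded


def descendants(children, root):
    """Collect every code below `root` in a parent -> children mapping."""
    found, stack = set(), [root]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# Placeholder dictionary rows and hierarchy for illustration.
dictionary = [
    ("X001", "Asthma"),
    ("X002", "Asthma annual review"),
    ("X003", "Referral to asthma clinic"),
    ("X004", "Steroid inhaler"),
    ("X005", "Type 2 diabetes mellitus"),
]
children = {"X001": ["X006", "X007"], "X007": ["X008"]}

included, excluded = search_dictionary(
    dictionary, ["asthma", "inhaler"], ["referral", "review"]
)
related = descendants(children, "X001")
print(included, excluded, sorted(related))
```

Returning the excluded rows alongside the included ones supports the iterative refinement in item 6d, since both lists need to be inspected before the search and exclusion terms are revised.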
Step 7: Review
Your codelist, and how it was created, needs to be reviewed to check for omissions and mistakenly included codes. (7a - Reviewers) A suitable reviewer with relevant knowledge about your clinical concept of interest and experience of the health care setting of your study should be identified. Reviewers may be within your research group, or you may need to reach out to other researchers in the field (e.g., an asthma codelist may be reviewed by a general practitioner, asthma researcher or internal medicine physician). The actual review process can be handled in real time or asynchronously (e.g., via email or a GitHub issue thread). Having multiple reviewers who need to agree on the final codelist can further increase trust in the review process. (7b - Scope of review) The reviewer(s) should first read the description of the clinical concept, then, for each of the codes in the draft codelist, decide if the code is appropriate to include. Reviewing only the codelist, without reviewing the process of how it was generated, risks missing codes that should be included; therefore, the method of how the codelist was created should also be reviewed. It is particularly important to give the reviewer(s) the full list of search terms and exclusion terms (e.g., are all terms included that could possibly refer to asthma?). Make sure to implement all the required changes and re-review if necessary. Whether or not to re-review is up to your judgment, but in general it will be more important when new search terms need to be added than when only a few codes need to be dropped. (7c - Evidence of review) During the review process, interactions between the reviewer(s) and codelist creator(s) should be documented, e.g., via a GitHub issue thread, or a spreadsheet where reviewers mark each code with yes/no or possible/probable/unlikely (e.g., "referral to asthma clinic" may be marked as a code to be excluded, or as a code to be included in a category of "possible asthma").
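When several reviewers mark each code, their decisions can be combined to show where they agree and where discussion is needed. The sketch below assumes a simple yes/no marking exported from a review spreadsheet; the function name `reconcile`, the reviewer labels, and the codes are hypothetical.

```python
def reconcile(reviews):
    """Combine reviewer decisions per code.

    reviews: dict mapping a code to a dict of {reviewer: "yes" or "no"}.
    Returns (include, exclude, disputed). A code is disputed if reviewers
    disagree or if nobody has reviewed it yet.
    """
    include, exclude, disputed = set(), set(), set()
    for code, decisions in reviews.items():
        votes = set(decisions.values())
        if votes == {"yes"}:
            include.add(code)
        elif votes == {"no"}:
            exclude.add(code)
        else:
            disputed.add(code)
    return include, exclude, disputed


# Placeholder review decisions for illustration.
reviews = {
    "X001": {"reviewer_1": "yes", "reviewer_2": "yes"},
    "X003": {"reviewer_1": "no", "reviewer_2": "yes"},
    "X004": {"reviewer_1": "no", "reviewer_2": "no"},
    "X009": {},
}
include, exclude, disputed = reconcile(reviews)
print(sorted(include), sorted(exclude), sorted(disputed))
```

Keeping the per-reviewer decisions, rather than only the reconciled result, preserves the evidence of review described in item 7c.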

Step 8: Check
Where possible, codelists should be checked against the database they were created for (internal) or other data sources (external). (8a - Internal checks) Internal checks within the intended database could include reporting the number of individuals who were identified with the clinical concept of interest, and sensitivity analyses comparing versions of the codelist with different inclusion/exclusion criteria applied. (8b - External checks) External checks could include comparing the prevalence and incidence measured within the dataset with external literature, or a validation study using GP questionnaires to investigate differences between clinical diagnoses and electronic recording. More detail on validation methods can be found in a previous publication [23].
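An internal check of the kind described in item 8a can be as simple as counting the individuals retrieved by different versions of a codelist. The sketch below is illustrative only: the records, codes, and population size are invented, and the event data are assumed to be available as (patient identifier, code) pairs.

```python
def patients_with_code(records, codelist):
    """Return the set of patient identifiers with at least one listed code."""
    codes = set(codelist)
    return {patient_id for patient_id, code in records if code in codes}


# Invented records for illustration: (patient_id, code).
records = [
    (1, "X001"), (1, "X002"), (2, "X003"), (3, "X005"), (4, "X001"),
]
population_size = 10  # assumed number of individuals in the study population

specific_codelist = {"X001"}            # "definite" codes only
sensitive_codelist = {"X001", "X003"}   # also includes "possible" codes

specific_patients = patients_with_code(records, specific_codelist)
sensitive_patients = patients_with_code(records, sensitive_codelist)

prevalence_specific = len(specific_patients) / population_size
prevalence_sensitive = len(sensitive_patients) / population_size
gained_by_sensitive = sensitive_patients - specific_patients
print(prevalence_specific, prevalence_sensitive, sorted(gained_by_sensitive))
```

The resulting proportions can then be compared with expectations from clinical knowledge or previously published estimates, and the individuals gained by the broader version can be examined to judge whether the additional codes are appropriate.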
Step 9: Publish
Finally, you should publish your codelist and the metadata required by reporting guidelines such as RECORD. You should also publish resources used to create the codelist and related documentation to help readers to review, evaluate or reproduce your study, and reuse or adapt your codelist for future work. (9a - Codelist published) Codelists can be uploaded to general purpose repositories, ideally adhering to FAIR (Findable, Accessible, Interoperable, Reusable) principles [26]. Examples of such repositories include zenodo.org or the Open Science Framework. You may also be able to adhere to FAIR principles when using your organisation's research output repository, a GitHub or GitLab repository, or uploading your codelist(s) as supplemental materials to your study. Codelists should be shared in a suitable format that is both human- and machine-readable (.txt or .csv). (9b - Resources published) Share all resources used to create the codelist, such as search terms, scripts, and references, alongside the codelist. Depending on where the codelist is hosted, there may be predefined fields for metadata, or metadata can be included as part of the checklist.
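A codelist and the resources used to create it can be exported in plain, machine-readable formats using only the Python standard library. The following is a sketch under stated assumptions: the column names, the metadata fields, and the function names are hypothetical choices for this example rather than a required schema, and the codes are placeholders.

```python
import csv
import io
import json


def codelist_to_csv(rows):
    """Serialise (code, term, category) rows as CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code", "term", "category"])
    writer.writerows(rows)
    return buffer.getvalue()


def metadata_to_json(metadata):
    """Serialise codelist metadata as indented JSON with sorted keys."""
    return json.dumps(metadata, indent=2, sort_keys=True)


# Placeholder content for illustration.
rows = [
    ("X001", "Asthma", "definite"),
    ("X003", "Referral to asthma clinic", "possible"),
]
metadata = {
    "clinical_concept": "codes describing current or previous asthma",
    "terminology": "placeholder terminology and version",
    "search_terms": ["asthma", "inhaler"],
    "exclusion_terms": ["referral", "review"],
}

csv_text = codelist_to_csv(rows)
json_text = metadata_to_json(metadata)
print(csv_text)
print(json_text)
```

The two strings can be written to files (for example `codelist.csv` and `metadata.json`, names chosen here for illustration) and uploaded together, so that the search terms and other resources accompany the codelist.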

Discussion
We have developed a checklist to support the creation, adaptation, and re-use of high-quality codelists for research using routinely collected health data, accompanied by step-by-step guidance. These were developed by researchers with relevant expertise and experience, including members of the EHR research group at LSHTM, which has employed codelist-based data extraction for hundreds of studies on a large range of health-related topics. In Table 2 we include an example of a filled-in checklist.
We expect these guidelines to be implemented by a wide range of institutions and research groups, including the EHR group at LSHTM. The guidelines can be used to train new EHR researchers, and to develop or strengthen internal guidelines for publishing codelists. Developers of codelist-sharing platforms can also use these guidelines to identify the metadata required to allow codelists to be updated and reused. In comparison to previously published recommendations, the checklist and guidance here aim to be as universally applicable as possible within a research context, assuming as little as possible about the way of working, type of codelists to be created, type of terminology used, or tools used to create the codelist. As a consequence, it is not possible to cover every specific case in detail; therefore, narrower guidance may also be useful. Examples of more specific guidance include guidance on creating drug codelists [10], SNOMED CT codelists using concept hierarchies [16-18], codelists using Stata scripts [8], and codelists using the "termset" method [7].
The guidance was developed with more challenging coding systems in mind, such as SNOMED CT and Read codes, which have complex or overlapping hierarchical structures. The checklist is designed to cope with this complexity; however, some steps of the codelist creation process in other settings (e.g., using only ICD coding) may be simplified.
The guidance was developed with research as the use case; however, codelists developed for research may end up being used in clinical practice. Further guidance, developed with public, patient, and healthcare worker input, is needed for a clinical care setting to maximise clinical benefit and prevent avoidable harm. This guidance underwent different validation steps [27], including a literature search, pilot testing and a survey of peers. We have published the guidance in NIHR Open Research to support collaboration with the wider EHR community through open peer review, and to enable others to build upon the ideas presented here. Subsequent iterations, subject to funding, should involve pilot testing and input from larger groups of stakeholders, to ensure recommendations are useful for EHR researchers working in a range of different settings and on different topics.
While codelists are shared alongside the majority of studies that use them (a recent review found about 70% reported at least one diagnostic or treatment code), the resources used to create these codelists are rarely shared [28]. Besides journals requiring (as with analytical code) [29,30] that codelists be published alongside manuscripts, data providers and research organisations should be encouraged to establish and maintain repositories that facilitate sharing of more complete codelist information. Future research may review current codelist banks with a view to improving the completeness of information captured.

Conclusion
Codelists form the foundation of research using routinely collected health data; however, they may often be of suboptimal standard, not capturing what they are supposed to capture, and the way in which they are created and shared often precludes reuse and reproducibility. With this work, we provide a checklist and step-by-step guidance to help researchers adhere to best practice.

Duncan Edwards
University of Cambridge, Cambridge, England, UK Is the rationale for developing the new method (or application) clearly explained?○ Yes.The need for improving the quality of codelists is well recognised by those involved in this area and this guidance is more robust and detailed guidance than I have seen before, which has tended to be in the form of expert opinion from smaller groups.I suspect many researchers will immediately seek to embed this guidance in their groups' operating procedures.Is the description of the method technically sound?○ Yes.Iterative development and trialling from an expert group, was an appropriate method.It's a weakness that the group contained only folk from LSHTM, though at least this fact is stated.Are sufficient details provided to allow replication of the method development and its use ○ by others?Mostly.I am surprised that for 7b (or perhaps earlier) that some sort of incidence/prevalence check when a draft codelist is applied to a population is not explicitly included (i.e.comparing vs previous estimates in other studies, or versus clinical judgement by the reviewer of what a reasonable number and the gender/age breakdown would be) -in my experience this data exploration stage is where a lot of major bugs with codelists are found.The description of 7b seems insufficiently detailed without explicit mention of this issue (though perhaps it is best covered elsewhere, I'm not sure 7b is the right place) -please consider a minor amendment.
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?

○
No relevant source data.
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes. However, there is no mention of the subsequent use of codelists in clinical practice. This is a huge issue in itself, and codelists developed based upon research papers can extremely swiftly be used in clinical practice to great population benefit (but also avoidable harm), e.g. https://www.judiciary.uk/prevention-of-future-death-reports/alexander-reid-prevention-of-future-deaths-report/
I'm not arguing that this paper should cover this issue (perhaps akin to the phase IV monitoring after drug trials) in great depth, because it is such a complex and important issue in its own right. However, zero mention of the issue seems inappropriate and fails to highlight to researchers their role and responsibility in the subsequent use by others of their codelists. So I would strongly suggest an amendment to the paper, albeit small, to mention this issue. Perhaps a paragraph in the discussion recognising the importance of phase IV monitoring of codelists subsequently used on the public, and the need for public, patient and frontline health worker engagement in developing best practice in this area too (perhaps the authors will tackle this in a subsequent paper!?).

Is the rationale for developing the new method (or application) clearly explained? Yes
Is the description of the method technically sound? Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? No relevant source data.
COMMENT 3.4: Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes. However, there is no mention of the subsequent use of codelists in clinical practice. This is a huge issue in itself, and codelists developed based upon research papers can extremely swiftly be used in clinical practice to great population benefit (but also avoidable harm), e.g. https://www.judiciary.uk/prevention-of-future-death-reports/alexander-reid-prevention-of-future-deaths-report/
I'm not arguing that this paper should cover this issue (perhaps akin to the phase IV monitoring after drug trials) in great depth, because it is such a complex and important issue in its own right. However, zero mention of the issue seems inappropriate and fails to highlight to researchers their role and responsibility in the subsequent use by others of their codelists. So I would strongly suggest an amendment to the paper, albeit small, to mention this issue. Perhaps a paragraph in the discussion recognising the importance of phase IV monitoring of codelists subsequently used on the public, and the need for public, patient and frontline health worker engagement in developing best practice in this area too (perhaps the authors will tackle this in a subsequent paper!?).
AUTHOR RESPONSE 3.4: Many thanks for alerting us to the potential use of research codelists in clinical care. We have now highlighted this issue in the discussion.
AUTHOR CHANGE 3.4: Added to discussion (p16): The guidance was developed with research as the use case; however, codelists developed for research may end up being used in clinical practice. Further guidance, developed with public, patient, and healthcare worker input, is needed for a clinical care setting to maximize clinical benefit and prevent avoidable harm.
Is the rationale for developing the new method (or application) clearly explained? Yes
Is the description of the method technically sound? Yes
COMMENT 3.5: Are sufficient details provided to allow replication of the method development and its use by others? Partly
AUTHOR RESPONSE 3.5: We believe that our methods are clearer following our review of the manuscript. For example, we added a summary of questionnaire responses to the manuscript.
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? No source data required
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes
well it performed against these, would be useful for the next user of the code list and would encourage local teams to be cautious about applying code lists without checking them.
AUTHOR RESPONSE 2.5: Thank you for this comment. We agree with you and encourage the reader to trial their codelist against internal and external data. Your comment aligns with reviewer 1 comment 1.7; we have considered it in our response.
Competing Interests: No competing interests were disclosed.

Shirley Wang
Howard Hughes Medical Institute - Harvard Medical School, Boston, Massachusetts, USA
This paper describes a checklist for code list development and sharing of the code lists after creation. This is very important work, and the authors should be commended for the thoughtfully developed and comprehensive checklist for creating code lists. While the questionnaire and example code lists are provided in supplemental material, the results of the questionnaire that were used to support discussion and adaptation of the checklist are not.
Minor comments: I would contend that this checklist applies not only to electronic health records research, but also to claims based research.
All members of the task group and workshop were from LSHTM. However, the potential impact of this paper could have been far larger if the authors engaged members/perspectives outside of their own organization, from groups that have been thinking deeply about and already actively implementing processes for sharing code lists. For example, representation from groups like OHDSI, Sentinel, CPRD, N3C, etc. Such groups have experiences, preferences, and pipelines that may differ from experiences at LSHTM that would be helpful to inform what criteria would represent "good practice" for code list development. A task force gathering of experts could have been facilitated through a research society like the International Society for Pharmacoepidemiology or the International Society for Pharmacoeconomics Research. Just a thought for future work in this space.
The recommendations are predicated on investigators intending to share code lists (which I very much support!). It would be helpful to have some discussion of the current prevalence of code list sharing (probably quite low). Do the authors have thoughts on how to encourage more routine citation of code lists? What other encouragement or culture shifts would be necessary for researchers outside LSHTM to routinely share their code lists?
Code list creation based on mapping from other ontologies is mentioned. A little more discussion or cautionary words about the hazards of doing so given imperfect mappings and unmappable concepts could be helpful.
Step 3 Verify. It would be helpful to emphasize more the importance of documenting performance characteristics of the code list (whether existing or new) or documenting the absence of such performance characteristics. If performance characteristics are available, then some details about the population in which it was validated would also be useful, for the user to understand how applicable those measurement characteristics would be to the population in which the code list is used.
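The mapping hazards raised in this comment can be made visible with a small report of unmappable and one-to-many codes. The sketch below is an illustration only: the function name, the shape of the `mapping` table (source code to a list of target codes) and the codes in the usage example are hypothetical and do not correspond to any real ontology mapping.

```python
def map_codelist(source_codes, mapping):
    """Translate a codelist through a mapping table and report problems.

    `mapping` is assumed to be a dict: source code -> list of target codes.
    Returns (mapped, unmapped, ambiguous):
      mapped    - sorted list of all target codes reached
      unmapped  - source codes with no target (unmappable concepts)
      ambiguous - source codes with more than one target (needs manual review)
    """
    mapped, unmapped, ambiguous = set(), [], {}
    for code in source_codes:
        targets = mapping.get(code, [])
        if not targets:
            unmapped.append(code)
            continue
        if len(targets) > 1:
            ambiguous[code] = list(targets)
        mapped.update(targets)
    return sorted(mapped), unmapped, ambiguous


# Made-up codes for illustration only.
toy_mapping = {"S1": ["T1"], "S2": ["T2", "T3"], "S3": []}
mapped, unmapped, ambiguous = map_codelist(["S1", "S2", "S3", "S4"], toy_mapping)
```

Documenting the `unmapped` and `ambiguous` outputs alongside the mapped codelist lets a later user see where the mapping was imperfect.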
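The performance characteristics mentioned in this comment (sensitivity, specificity and positive predictive value) could be computed along the following lines when a reference standard is available. This is a minimal sketch under assumptions: the function name and the two input dictionaries (patient identifier to True/False) are hypothetical, and the choice of reference standard and validation population is left to the researcher.

```python
def performance(flagged, reference):
    """Compare codelist classification with a reference standard.

    `reference`: dict patient_id -> True if the patient truly has the condition.
    `flagged`:   dict patient_id -> True if the codelist identified the patient
                 (patients missing from `flagged` count as not identified).
    Returns a dict of metrics; a metric is None when its denominator is zero.
    """
    tp = fp = fn = tn = 0
    for pid, truth in reference.items():
        hit = flagged.get(pid, False)
        if hit and truth:
            tp += 1
        elif hit and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "n_validated": len(reference),
    }
```

Reporting `n_validated` together with a description of the validation population helps a later user judge how far the figures apply to their own data.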

Is the rationale for developing the new method (or application) clearly explained? Yes
Is the description of the method technically sound? Yes

Are sufficient details provided to allow replication of the method development and its use by others? Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Pharmacoepidemiology, meta-research
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
code lists after creation. This is very important work, and the authors should be commended for the thoughtfully developed and comprehensive checklist for creating code lists.
AUTHOR RESPONSE 1.1: Many thanks for reviewing our manuscript and for your kind comments and expert advice.
COMMENT 1.2: While the questionnaire and example code lists are provided in supplemental material, the results of the questionnaire that were used to support discussion and adaptation of the checklist are not.

AUTHOR RESPONSE 1.2:
We have added the meeting minutes and transcribed questionnaire responses to the study's online repository.

AUTHOR CHANGE 1.2:
No changes to the manuscript necessary. The repository DOI cited in the manuscript represents all versions, and now resolves to the latest version of the repository.
Minor comments
COMMENT 1.3: I would contend that this checklist applies not only to electronic health records research, but also to claims based research.

AUTHOR RESPONSE 1.3:
We agree that the checklist applies to all routinely collected database studies that require the generation of codelists to answer the study question, especially those that use complex dictionaries or ontologies such as SNOMED or ATC codes that are used in many databases worldwide.We changed our terminology throughout the manuscript, and we are now using the term "routinely collected health data" instead of "electronic health records" to emphasise that this checklist applies to all types of routinely collected health data including claims data as well.

AUTHOR CHANGE 1.3:
We changed the term "electronic health records" into "routinely collected health data" throughout the manuscript.
COMMENT 1.4: All members of the task group and workshop were from LSHTM. However, the potential impact of this paper could have been far larger if the authors engaged members/perspectives outside of their own organization, from groups that have been thinking deeply about and already actively implementing processes for sharing code lists. For example, representation from groups like OHDSI, Sentinel, CPRD, N3C, etc. Such groups have experiences, preferences, and pipelines that may differ from experiences at LSHTM that would be helpful to inform what criteria would represent "good practice" for code list development. A task force gathering of experts could have been facilitated through a research society like the International Society for Pharmacoepidemiology or the International Society for Pharmacoeconomics Research. Just a thought for future work in this space.
AUTHOR RESPONSE 1.4: Our working group initially aimed to recommend an existing codelist bank for everyone to use or develop a codelist bank for LSHTM's EHR group. We quickly realised how complex and ambitious this task was, especially given the lack of allocated funding or time. We therefore prioritised developing broad internal recommendations for codelist creation and sharing that are applicable to multiple use cases. We have shared this work in our manuscript. A future mission is to improve the quality and availability of codelist banks. We are open to working with a wider group of experts to secure funding for this work, and agree that this would increase the relevance and impact of our work.
AUTHOR CHANGE 1.4: See comment 5 below as the changes are linked.
COMMENT 1.5: The recommendations are predicated on investigators intending to share code lists (which I very much support!). It would be helpful to have some discussion of the current prevalence of code list sharing (probably quite low). Do the authors have thoughts on how to encourage more routine citation of code lists? What other encouragement or culture shifts would be necessary for researchers outside LSHTM to routinely share their code lists?
AUTHOR RESPONSE 1.5: Thank you for your comment. There has been a welcome recent push towards open science and reproducibility by publishing both programming code and codelists. A requirement for authors to publish all programming code and study code lists with any manuscript might help to create a culture of routinely sharing code lists. Additionally, encouraging data providers and research organisations to establish and maintain accessible repositories for these resources would further enhance this. An example of such an initiative is OpenSAFELY and OpenCodelists, where the sharing of codelists and programming code is an integral part of the study workflow. These platforms facilitate reproducibility and improve the overall transparency of medical research.
AUTHOR CHANGE 1.5: Added discussion (p16): While codelists are shared alongside the majority of studies that use them (a recent review found about 70% reported at least one diagnostic or treatment code), the resources used to create these codelists are rarely shared 28. Besides journals necessitating (as with analytical code) 29,30 that codelists be published alongside manuscripts, data providers and research organisations should be encouraged to establish and maintain repositories that facilitate sharing of more complete codelist information. Future research may review current codelist banks with a view to improving the completeness of information captured.
sensitivity, specificity or positive predictive value of a certain code list. More detail on potential options for conducting a code list validation can be found in a previous publication (Herrett et al.).

© 2024 Wang S. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Shirley Wang
Howard Hughes Medical Institute - Harvard Medical School, Boston, Massachusetts, USA
The authors adequately addressed my comments.
Competing Interests: I have been an ad hoc consultant for Exponent Inc, Cytel Inc, and MITRE, a federally funded research and development center for the Centers for Medicare and Medicaid Services.
Reviewer Expertise: Pharmacoepidemiology, meta-research
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Version 1
Reviewer Report 24 June 2024
https://doi.org/10.3310/nihropenres.14709.r31892
© 2024 Edwards D. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Are sufficient details provided to allow replication of the method development and its use by others? Partly
If any results are presented, are all the source data underlying the results available to ensure full reproducibility? No source data required
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes
Competing Interests: No competing interests were disclosed.

Table 1. Checklist.
| Step | No | Item | Information to be provided |
| Metadata | | | |
| | | c. Accuracy | Should the codelist capture probable or definite codes? |
| | | d. Setting | What is the (health care) setting (e.g., primary care, hospital care)? |
| Identify and evaluate existing codelists | | | |
| Search | 2 | a. Sources searched | Which sources were searched (e.g., internet search, codelist repositories)? |
| | | b. Existing codelists found | Which suitable codelists did you find? |
| Verify | 3 | a. Verified by others | Which information is available to verify the quality of suitable codelists? |
| | | b. Verified by yourself | Which checks did you conduct to verify the quality of suitable codelists? |
| Reference | 4 | a. Existing codelists used | Are you making use of any existing codelists? If yes, reference these, and specify how they are being used. |
| Create a new codelist | | | |
| Prepare | 5 | a. Synonyms | What are synonyms and related words for the clinical concept (e.g., different names for a disease/drug) and how did you identify these (e.g., source of clinical knowledge)? |
| | | b. Exceptions | What should not be included in the codelist? |
| Create | 6 | a. Method used | Which method (e.g., a script, a tool) did you use to create the draft codelist? |
| | | b. Search terms | Which search terms, and if applicable, exclusion terms did you use? |
| | | c. Hierarchy used to extend search | Did you use a dictionary hierarchy (e.g., ICD-10 chapters, SNOMED-CT concepts) to modify your search? If yes, specify. |
| | | d. Decisions made while iterating | Which decisions did you make while iteratively refining the draft codelist? |
| | | e. (Optional) Categories | Did you specify subcategories within the codelist? If yes, specify. |
| | | b. External checks | What method(s) were used for external checks, if any, and what are the findings? |
| Publish | 9 | a. Codelist published | Where is the codelist published? |
| | | b. Resources published | Where are the resources used to create the codelist (e.g., scripts, list of terms)? |
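Items 6a and 6b of the checklist (method used, search terms and exclusion terms) could be documented with a short script such as the one below. This is a sketch under assumptions rather than a recommended tool: the function name, the `(code, term)` dictionary layout and the toy rows are hypothetical, the codes are made up, and a real search would run against the full clinical dictionary for the dataset in question.

```python
def draft_codelist(dictionary, search_terms, exclusion_terms=()):
    """Build a draft codelist by case-insensitive substring search.

    `dictionary` is assumed to be an iterable of (code, term) pairs.
    Returns (included, excluded): rows matching a search term, split by
    whether they also match an exclusion term. Keeping the excluded rows
    makes the decisions reviewable (checklist item 6d).
    """
    search = [t.lower() for t in search_terms]
    exclude = [t.lower() for t in exclusion_terms]
    included, excluded = [], []
    for code, term in dictionary:
        text = term.lower()
        if not any(s in text for s in search):
            continue  # not a candidate at all
        if any(e in text for e in exclude):
            excluded.append((code, term))
        else:
            included.append((code, term))
    return included, excluded


# Made-up dictionary rows for illustration only; not real clinical codes.
toy_dictionary = [
    ("X001", "Atopic eczema"),
    ("X002", "Family history of eczema"),
    ("X003", "Dermatitis, unspecified"),
    ("X004", "Asthma"),
]
included, excluded = draft_codelist(
    toy_dictionary, ["eczema", "dermatitis"], ["family history"]
)
```

Publishing the script together with the search and exclusion terms (item 9b) lets others rerun the search when the dictionary is updated.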

Table 2. Example of filled-in checklist. Columns: Step, No, Item, Information to be provided.
Identified a number of codelists but none for CPRD Aurum; one study describing validation of eczema codelists was found: Abuabara et al.